(57) Abstract

The present invention relates to the three-
dimensional structures of a protein tyrosine kinase
optionally complexed with one or more compounds.
The atomic coordinates that define the structures of 
the protein tyrosine kinase and any of the compounds 
bound to it are pertinent to methods for determining 
the three-dimensional structures of protein tyrosine 
kinases with unknown structure and to methods 
that identify modulators of protein tyrosine kinase 
functions. 
DESCRIPTION

CRYSTAL STRUCTURES OF A PROTEIN TYROSINE KINASE 

RELATED APPLICATIONS

This application is related to U.S. Application
Serial No. 08/701,191, by Mohammadi et al., entitled
"Crystals of the Tyrosine Kinase Domain of Non-Insulin
Receptor Tyrosine Kinases," filed August 21, 1996 (Lyon
& Lyon Docket No. 227/088) and U.S. Application Serial
No. 60/034,168, by McMahon et al., entitled "Crystal
Structures of a Protein Tyrosine Kinase Complexed with
Compounds of the Oxindolinone/Thiolindolinone Family,"
filed December 19, 1996 (Lyon & Lyon Docket No.
221/282), which are hereby incorporated herein by
reference in their entirety including any drawings,
tables, and figures.

INTRODUCTION 

The present invention relates to the three-dimensional
structures of protein kinases.

BACKGROUND OF THE INVENTION 
The following description of the background of the
invention is provided simply as an aid in understanding
the invention and is not admitted to describe or
constitute prior art to the invention.

Protein tyrosine kinases (PTKs) comprise a large
and diverse class of enzymes (for a review, see
Schlessinger and Ullrich, 1992, Neuron 9: 383-391). The
PTK family contains multiple subfamilies, one of which
is the fibroblast growth factor receptor (FGF-R)
subfamily (for a review, see Givol and Yayon, 1992,
FASEB J. 6(15): 3362-3369).

All PTKs enzymatically transfer a high energy
phosphate from adenosine triphosphate to a tyrosine
residue in a target protein. These phosphorylation
events regulate cellular phenomena in signal
transduction processes. Cellular signal transduction
processes contain multiple steps that convert an
extracellular signal into an intracellular signal. The
intracellular signal is then converted into a cellular
response. PTKs are components in many signal
transduction processes. A PTK regulates the flow of a
signal in a particular step in the process by
phosphorylating a downstream molecule. The addition of
a phosphate can modulate the activity of the
downstream molecule by turning it either "on" or "off". Thus,
aberrations in a particular PTK's activity can cause
either overflow or underflow of the signal. Overflow of
a signal can lead to such abnormalities as uncontrolled
cell proliferation, which is representative of such
disorders as cancer and angiogenesis.

Scientists in the biomedical community are
searching for PTK inhibitors that down-regulate overflow
signal transduction pathways. In particular, small
molecule PTK inhibitors are sought that can traverse the
cell membrane and not become hydrolyzed in acidic
environments. These small molecule PTK inhibitors can
be highly bioavailable and can be administered orally to
patients.

Some small molecule PTK inhibitors have already
been discovered. For example, bis(monocyclic), bicyclic
or heterocyclic aryl compounds (PCT WO 92/20642),
vinylene-azaindole derivatives (PCT WO 94/14808),
1-cyclopropyl-4-pyridyl-quinolones (U.S. Patent No.
5,330,992), styryl compounds (U.S. Patent No.
5,217,999), styryl-substituted pyridyl compounds (U.S.
Patent No. 5,302,606), certain quinazoline derivatives
(EP Application No. 0 566 266 A1), selenoindoles and
selenides (PCT WO 94/03427), tricyclic polyhydroxylic
compounds (PCT WO 92/21660), and benzylphosphonic acid
compounds (PCT WO 91/15495) are described as PTK
inhibitors.

Although many PTK inhibitors are known, many of
these are not specific for PTK subfamilies and will
therefore cause multiple side-effects as therapeutics.
Compounds of the indolinone family, however, are
specific for the FGFR subfamily and are
non-hydrolyzable (see WO 96/40116, "Indolinone Compounds for
the Treatment of Disease," published December 19, 1996,
inventors Tang et al.). Although the use of X-ray
crystallography has provided three dimensional
structures of other PTKs, they are not complexed with
PTK subfamily specific, hydrolysis resistant, small
molecules.

Despite recent advances, the need remains in the
art for crystallographic analysis of protein kinases, so
that improved therapeutic molecules can be designed and
synthesized.



SUMMARY OF THE INVENTION
The present invention relates to the three
dimensional structures of protein tyrosine kinases. The
use of X-ray crystallography can define the three
dimensional structure of a protein tyrosine kinase at
atomic resolution.

The three dimensional structures described herein
elucidate specific interactions between protein tyrosine
kinases and compounds bound to them. The coordinates
that define the three dimensional structures of protein
tyrosine kinases are useful for determining three
dimensional structures of PTKs with unknown structure.
In addition, the coordinates are also useful for
designing and identifying modulators of protein tyrosine
kinase function. These modulators are potentially
useful as therapeutics for diseases, including (but
not limited to) cell proliferative diseases, such as cancer,
angiogenesis, atherosclerosis, and arthritis.

Thus in a first aspect, the invention features a 
crystalline form of a polypeptide corresponding to the 
catalytic domain of a protein tyrosine kinase. 

The term "crystalline form," in the context of the
invention, is a crystal formed from an aqueous solution
comprising a purified polypeptide corresponding to the
catalytic domain of a PTK. A crystalline form of a
protein tyrosine kinase is characterized as being
capable of diffracting x-rays in a pattern defined by
one of the crystal forms depicted in Blundell et al.,
1976, Protein Crystallography, Academic Press. A
crystalline form of a protein kinase is not
characterized as being capable of diffracting x-rays in
a pattern analogous to a crystalline form consisting of
primarily salt or primarily a compound, for example.

The term "protein tyrosine kinase," or PTK, refers
to an enzyme that transfers the high energy phosphate of
adenosine triphosphate to a tyrosine residue located on
a protein target.

A protein tyrosine kinase catalytic domain of the
invention can originate from receptor protein tyrosine
kinases that bind fibroblast growth factor (FGF). These
protein tyrosine kinases are known as "FGFR" herein, and
can relate to one member of the FGFR family, such as
FGFR1.

The term "catalytic domain" refers to the region of
a protein that can exist as a separate entity from the
protein. The catalytic domain of a protein tyrosine
kinase is characterized as having considerable amino
acid identity to the catalytic domain of other protein
tyrosine kinases. Considerable amino acid identity
preferably refers to at least 30% identity, more
preferably at least 35% identity, and most preferably at
least 40% identity. These degrees of amino acid
identity refer to the identity between different protein
tyrosine kinase families. Amino acid identity for
members of a given protein tyrosine kinase family ranges
from 55% to 90%. The catalytic domain may be functional
as a separate entity. The catalytic domain of a protein
tyrosine kinase is also characterized as a polypeptide
that is soluble in solution.

The term "identity" as used herein refers
to a property of sequences that measures their
similarity or relationship. Identity is measured by
dividing the number of identical residues in the two
sequences by the total number of residues and
multiplying the result by 100. Thus, two copies of
exactly the same sequence have 100% identity, but
sequences that are less highly conserved and have
deletions, additions, or replacements have a lower
degree of identity. Those skilled in the art will
recognize that several computer programs are available
for determining sequence identity.
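
The identity measure defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any particular alignment program: the two sequences are assumed to be already aligned to equal length with '-' marking gaps, the "total number of residues" is taken to be the alignment length, and the function name percent_identity and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumptions: '-' marks a gap, and the denominator is the full
    alignment length (other conventions exist).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # Count positions where both sequences carry the same residue;
    # a gap never counts as a match.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Two copies of exactly the same sequence.
print(percent_identity("GSGAFG", "GSGAFG"))
# One replacement and one deletion lower the degree of identity.
print(percent_identity("GSGAFG", "GAG-FG"))
```

Choosing the alignment length as the denominator is one convention among several; dividing by the length of the shorter sequence would give a different percentage for the same alignment.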

The term "functional" refers to the ability of a
catalytic domain to convert a substrate into a product
by phosphorylating the substrate. The term "functional"
also relates to the ability of a catalytic domain to
bind natural binding partners. The catalytic region
may comprise an N-terminal tail, a catalytic core, and a
C-terminal tail. The catalytic core is a polypeptide
that can be functional in terms of catalysis. N- and
C-terminal tails are polypeptide regions that may not
confer appreciable functionality in terms of catalysis,
but may confer functionality in terms of modulator
specificity.

A polypeptide can exist as a catalytic domain
even though it is not functional. For example, a
polypeptide corresponding to a catalytic domain may not
be functional if it does not harbor phosphate moieties
in key areas. Multiple examples of phosphorylation-state
dependent function are well documented in the art.
Therefore, a catalytic domain can also exist without
being functional. A measure of a protein kinase
catalytic domain is a polypeptide that is homologous to
other protein kinase catalytic domains.

The term "polypeptide" refers to an amino acid
chain representing a portion of, or the entire sequence
of, amino acids comprising a protein.

A preferred embodiment of the invention includes a
crystalline form of a PTK that is a receptor PTK.
Receptors are proteins that straddle the inside and
outside of the cell membrane. Receptor PTKs comprise an
extracellular region, a transmembrane region, and an
intracellular region comprising a catalytic domain.

Another preferred embodiment of the invention is
the crystalline form of a receptor PTK selected from the
group consisting of FGF-R, PDGF-R, FLK, CCK4, MET, TRKA,
AXL, TIE, EPK, RYK, DDR, ROS, RET, LTK, ROR1, and MUSK.

Yet another preferred embodiment of the invention
is the crystalline form of a PTK that is a non-receptor
PTK. Non-receptor PTKs are located inside the cell and
do not harbor extracellular or membrane-spanning
polypeptides attached to the polypeptide corresponding
to the catalytic domain. Non-receptor PTKs may harbor
fatty acids or lipids, which can impart a membrane
associated character to a PTK. In preferred embodiments
of the invention, crystalline forms of non-receptor PTKs
are selected from the group consisting of SRC, BRK, BTK,
CSK, ABL, ZAP70, FES, FAK, JAK, and ACK.

In still another preferred embodiment, the
invention features a crystalline form of a PTK that
comprises a heavy metal atom. These types of crystals
can be referred to as derivative crystals.

The term "derivative crystal" refers to a crystal
where the polypeptide is in association with one or more
heavy-metal atoms.

The term "association" refers to a condition of
proximity between a chemical entity or compound, or
portions or fragments thereof, and a tyrosine kinase
domain protein, or portions or fragments thereof. The
association may be non-covalent, i.e., where the
juxtaposition is energetically favored by, e.g.,
hydrogen-bonding, van der Waals, electrostatic or
hydrophobic interactions, or it may be covalent.

The term "heavy metal atom" refers to an atom that 
is a transition element, a lanthanide metal, or an 
actinide metal. Lanthanide metals include elements with 
atomic numbers between 57 and 71, inclusive. Actinide
metals include elements with atomic numbers between 89 
and 103, inclusive. 
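
As an illustration of this definition, the following Python sketch classifies an element by atomic number. The lanthanide and actinide ranges are the ones given above. The text does not give atomic numbers for "transition element", so the ranges used here (the d-block, groups 3 to 12) are an assumption, and the name heavy_metal_class is hypothetical.

```python
from typing import Optional

# Ranges stated in the text (inclusive atomic numbers).
LANTHANIDES = range(57, 72)   # 57 through 71
ACTINIDES = range(89, 104)    # 89 through 103

# Assumption: "transition element" is read as the d-block, groups 3-12,
# excluding elements already counted as lanthanides or actinides.
TRANSITION_RANGES = (
    range(21, 31),
    range(39, 49),
    range(72, 81),
    range(104, 113),
)


def heavy_metal_class(atomic_number: int) -> Optional[str]:
    """Return the heavy-metal class of an element, or None if it has none."""
    if atomic_number in LANTHANIDES:
        return "lanthanide"
    if atomic_number in ACTINIDES:
        return "actinide"
    if any(atomic_number in r for r in TRANSITION_RANGES):
        return "transition"
    return None


# Mercury (80), samarium (62), uranium (92), and carbon (6).
for z in (80, 62, 92, 6):
    print(z, heavy_metal_class(z))
```

Whether group 12 elements such as mercury count as transition elements depends on the definition adopted; they are included here only because of the assumption stated above.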

In a preferred embodiment, the invention features a
crystal of an FGF receptor tyrosine kinase domain
protein. The FGF receptor tyrosine kinase domain
protein can relate to FGFR1.

The term "FGFR1" refers to one member of multiple
receptor PTKs that are homologous to one another and
bind FGF. In this context, the term "homologous" refers
to at least 70% amino acid identity between two members
of the FGFR family.

The term "FGFR1" can also refer to a mutant of
human FGFR1 which is characterized by the amino acid
sequence of SEQ ID NO: 2. As compared to human FGFR1,
the mutant FGFR1 contains the following amino acid substitutions:
Cys-488 → Ala, Cys-584 → Ser, Leu-457 → Val, and has an
additional five amino acid residues at the N-terminus
(Ser-Ala-Ala-Gly-Thr).

The term "human FGFR1" refers to the tyrosine
kinase domain of human fibroblast growth factor receptor
1 ("FGFR1") having the amino acid sequence of SEQ ID
NO: 1. Generally, human FGFR1 comprises a 310 amino acid
residue fragment (residues 456 to 765) of human FGFR1.

The term "mutant" refers to a polypeptide which is
obtained by replacing at least one amino acid residue in
a native tyrosine kinase domain with a different amino
acid residue. Mutation can be accomplished by adding
and/or deleting amino acid residues within the native
polypeptide or at the N- and/or C-terminus of a
polypeptide corresponding to a native tyrosine kinase
domain having substantially the same three-dimensional
structure as the native tyrosine kinase domain from
which it is derived. By having substantially the same
three-dimensional structure is meant having a set of
atomic structure coordinates that have a root mean
square deviation (r.m.s.d.) of less than or equal to
about 2 Å when superimposed with the atomic structure
coordinates of the native tyrosine kinase domain from
which the mutant is derived when at least about 50% to
100% of the Cα atoms of the native tyrosine kinase are
included in the superposition. A mutant may have, but
need not have, PTK activity.

In another preferred embodiment, the invention 
relates to a crystalline form defined by the structural 
coordinates set forth in Table 1. 

The term "atomic structural coordinates" as used
herein refers to a data set that defines the three
dimensional structure of a molecule or molecules.
Structural coordinates can be slightly modified and
still render nearly identical three dimensional
structures. A measure of a unique set of structural
coordinates is the root-mean-square deviation of the



WO 98/07835 



PCT/US97/14885 




resulting structure. Structural coordinates that render
three dimensional structures that deviate from one
another by a root-mean-square deviation of less than
1.5 Å may be viewed by a person of ordinary skill in the art
as identical. Hence, the structural coordinates set
forth in Table 1, Table 2, Table 3, and Table 4 are not
limited to the values defined therein.
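
The root-mean-square deviation criterion can be illustrated with a short Python sketch. It assumes the two coordinate sets list the same atoms in the same order and have already been superimposed, so no fitting step is performed. The function names and the toy coordinates are hypothetical, and the 1.5 Å default simply mirrors the threshold quoted above.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two pre-superimposed atom sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate sets of equal length")
    # Sum of squared distances between corresponding atoms.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def viewed_as_identical(
    coords_a: Sequence[Coord], coords_b: Sequence[Coord], cutoff: float = 1.5
) -> bool:
    """True if the two structures deviate by less than `cutoff` angstroms."""
    return rmsd(coords_a, coords_b) < cutoff


# Toy example: every atom displaced by 1 angstrom along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, shifted), viewed_as_identical(reference, shifted))
```

In practice the superposition itself (for example a least-squares fit over Cα atoms) would precede this calculation; that step is deliberately left out of the sketch.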

In other preferred embodiments, the invention
features a crystalline form of the polypeptide in
association with a compound. These types of crystalline
forms can be referred to as co-crystals. The compound
may be a cofactor, substrate, substrate analog,
inhibitor, or allosteric effector.

The term "compound" refers to an organic molecule.
The term "organic molecule" refers to a molecule which
has at least one carbon atom in its structure. The
compound can have a molecular weight of less than 6 kDa.
Both the geometry of the compound and the interactions
formed between the compound and the polypeptide
preferably govern high affinity binding between the two
molecules. High affinity binding is preferably governed
by a dissociation equilibrium constant on the order of
10⁻⁶ M or less. The compound is preferably a modulator
that alters the function of a PTK.
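
To give a feel for what a dissociation equilibrium constant on the order of 10⁻⁶ M means, the sketch below computes the fraction of polypeptide occupied by compound at a given free compound concentration. It assumes a simple one-site binding model, which the text does not specify, and it treats "on the order of 10⁻⁶ M or less" as a hard cutoff of 1e-6 M purely for illustration. All names are hypothetical.

```python
HIGH_AFFINITY_KD_M = 1e-6  # illustrative cutoff taken from the text


def fraction_bound(free_compound_molar: float, kd_molar: float) -> float:
    """Fractional occupancy for one-site binding: [L] / ([L] + Kd)."""
    if free_compound_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_compound_molar / (free_compound_molar + kd_molar)


def is_high_affinity(kd_molar: float) -> bool:
    """True if Kd is at or below the illustrative 1e-6 M cutoff."""
    return kd_molar <= HIGH_AFFINITY_KD_M


# At a free concentration equal to Kd, half of the sites are occupied.
print(fraction_bound(1e-6, 1e-6))
# A tighter binder (smaller Kd) is more fully occupied at the same concentration.
print(fraction_bound(1e-6, 1e-8), is_high_affinity(1e-8))
```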

The term "function," in reference to the effect of 
a modulator on PTK function, refers to the ability of a 
modulator to enhance or inhibit the catalytic activity 
of a PTK. 

The term "catalytic activity", in the context of
the invention, defines the ability of a PTK to
phosphorylate a substrate polypeptide. Catalytic
activity can be measured, for example, by determining
the amount of a substrate converted to a product as a
function of time. The conversion of the substrate to a
product occurs at the active-site of the PTK.

The term "active-site" refers to a cavity located
in the PTK in which one or more substrate molecules may
bind. Addition of a modulator to cells expressing a PTK
may enhance (activate) or lower (inhibit) the catalytic
activity of the PTK.

A small number of inhibitors of PTK catalytic
activity are known in the art. Small molecule
inhibitors may modulate PTK function by blocking the
binding of substrates. Indolinone compounds, for
example, may bind to the active-site of PTK catalytic
domains and inhibit them effectively, as measured by
inhibition constants on the order of 10⁻⁶ M or less.

Activators of PTK intracellular regions can enhance
PTK function by interacting with both the PTK catalytic
domain and the substrate. Activators may also promote
dimerization of PTKs and thus activate them by bringing
them into close proximity with one another. In
addition, activators may operate by promoting a
conformational change in the intracellular region of the
PTK such that the catalytic region modifies substrates
at a faster rate in the presence of the activator.

The term "function" can also refer to the ability 
of a modulator to enhance or inhibit the association 
between a PTK and a natural binding partner. 

The term "natural binding partner" refers to a
polypeptide that normally binds to a PTK in a cell.
These natural binding partners can play a role in
propagating a signal in a PTK signal transduction
process. The natural binding partner can bind to a PTK
with high affinity. High affinity represents an
equilibrium binding constant on the order of 10⁻⁶ M or
less. However, a natural binding partner can also
transiently interact with a PTK and chemically modify
it. PTK natural binding partners are chosen from a
group consisting of, but not limited to, src homology 2
(SH2) or 3 (SH3) domains, other phosphoryl tyrosine
binding (PTB) domains, nucleotide exchange factors, and
other protein kinases or protein phosphatases.

The term "interactions" refers to hydrophobic,
aromatic, and ionic forces and hydrogen bonds formed
between atoms in the modulator and the enzyme
active-site.

The term "cofactor" refers to a compound that may,
in addition to the substrate, bind to a protein and
undergo a chemical reaction. Multiple cofactors are
nucleotides or nucleotide derivatives, such as phosphate
and nicotinamide derivatives of adenosine.

The term "substrate" refers to a compound that
reacts with an enzyme. Enzymes can catalyze a specific
reaction on a specific substrate. For example, PTKs can
phosphorylate specific protein and peptide substrates on
tyrosine moieties. In addition, nucleotides can act as
substrates for protein kinases.

The term "substrate analog" refers to a compound
that is structurally similar, but not identical, to a
substrate. The substrate analog may be a nucleotide
analog. Examples of nucleotide analogs are described
below.

The term "inhibitor" refers to a compound that
decreases the cellular function of a protein kinase.
The protein kinase function is preferably the
interaction with a natural binding partner and more
preferably catalytic activity.

The term "allosteric effector" refers to a compound
that causes allosteric interactions in a protein. The
term "allosteric interactions" refers to interactions
between separate sites on a protein. The sites can be
different from the active site. The allosteric effector
can enhance or inhibit catalytic activity by binding to
a site that may be different from the active site.

The term "co-crystal" refers to a crystal where the
polypeptide is in association with one or more
compounds.

In preferred embodiments, a co-crystal of the
invention can be in association with a heavy metal atom.
Examples of heavy metal atoms are described above.

In other preferred embodiments, the invention
features a co-crystal comprising the crystalline form of
the polypeptide in association with a compound, where
the compound is a non-hydrolyzable analog of ATP. These
analogs can be referred to as nucleotide analogs.

The term "ATP" refers to the chemical compound
adenosine triphosphate.

The term "non-hydrolyzable" refers to a compound
having a covalent bond that does not readily react with
water. Examples of non-hydrolyzable analogs of ATP are
AMP-PNP and AMP-PCP, whose structures are well known to
those skilled in the art.

The term "AMP-PNP" refers to adenylyl
imidodiphosphate, a non-hydrolyzable analog of ATP.

The term "AMP-PCP" refers to adenylyl
diphosphonate, a non-hydrolyzable analog of ATP.

In another preferred embodiment, the invention 
relates to a crystalline form defined by the structural 
coordinates set forth in Table 2. 

In preferred embodiments, the invention relates to 
crystalline forms, where the compound in association 
with the polypeptide is an indolinone. 

Certain indolinones are specific modulators of PTK
function. A preferred embodiment of the invention is
the crystalline form of a PTK complexed with an
indolinone of formula I or II:

(I)

or a pharmaceutically acceptable salt, isomer,
metabolite, ester, amide, or prodrug thereof, where:

(a) A₁, A₂, A₃, and A₄ are independently carbon or
nitrogen;

(b) R₁ is hydrogen or alkyl;

(c) R₂ is oxygen in the case of an oxindolinone or
sulfur in the case of a thiolindolinone;

(d) R₃ is hydrogen;

(e) R₄, R₅, R₆, and R₇ are optionally present, and
are either (i) independently selected from the group
consisting of alkyl, alkoxy, aryl, aryloxy, alkaryl,
alkaryloxy, halogen, trihalomethyl, S(O)R, SO₂NRR', SO₃R,
SR, NO₂, NRR', OH, CN, C(O)R, OC(O)R, NHC(O)R, (CH₂)ₙCO₂R,
and CONRR' or (ii) any two adjacent R₄, R₅, R₆, and R₇
taken together form a fused ring with the aryl portion
of the indole-based portion of the indolinone;

(f) R₂', R₃', R₄', R₅', and R₆' are each
independently selected from the group consisting of
hydrogen, alkyl, alkoxy, aryl, aryloxy, alkaryl,
alkaryloxy, halogen, trihalomethyl, S(O)R, SO₂NRR', SO₃R,
SR, NO₂, NRR', OH, CN, C(O)R, OC(O)R, NHC(O)R, (CH₂)ₙCO₂R,
and CONRR';

(g) n is 0, 1, 2, or 3;

(h) R is hydrogen, alkyl or aryl;

(i) R' is hydrogen, alkyl or aryl; and

(j) A is a five membered heteroaryl ring selected
from the group consisting of thiophene, pyrrole,
pyrazole, imidazole, 1,2,3-triazole, 1,2,4-triazole,
oxazole, isoxazole, thiazole, isothiazole, furan,
1,2,3-oxadiazole, 1,2,4-oxadiazole, 1,2,5-oxadiazole,
1,3,4-oxadiazole, 1,2,3,4-oxatriazole, 1,2,3,5-oxatriazole,
1,2,3-thiadiazole, 1,2,4-thiadiazole, 1,2,5-thiadiazole,
1,3,4-thiadiazole, 1,2,3,4-thiatriazole, 1,2,3,5-thiatriazole,
and tetrazole, optionally substituted at
one or more positions with alkyl, alkoxy, aryl, aryloxy,
alkaryl, alkaryloxy, halogen, trihalomethyl, S(O)R,
SO₂NRR', SO₃R, SR, NO₂, NRR', OH, CN, C(O)R, OC(O)R,
NHC(O)R, (CH₂)ₙCO₂R or CONRR'.

The term "pharmaceutically acceptable salt" refers
to those salts which retain the biological activity and
properties of the free bases. Pharmaceutically
acceptable salts can be obtained by reaction with
inorganic acids such as hydrochloric acid, hydrobromic
acid, sulfuric acid, nitric acid, phosphoric acid,
methanesulfonic acid, ethanesulfonic acid,
p-toluenesulfonic acid, salicylic acid and the like.

The term "prodrug" refers to an agent that is
converted into the parent drug in vivo. Prodrugs may be
easier to administer than the parent drug in some
situations. For example, the prodrug may be
bioavailable by oral administration but the parent is
not, or the prodrug may improve solubility to allow for
intravenous administration.

"Alkyl" refers to a straight-chain, branched or
cyclic saturated aliphatic hydrocarbon. Preferably, the
alkyl group has 1 to 12 carbons. More preferably, it is
a lower alkyl of from 1 to 7 carbons, more preferably 1
to 4 carbons. Typical alkyl groups include methyl,
ethyl, propyl, isopropyl, butyl, isobutyl, tertiary
butyl, pentyl, hexyl and the like. The alkyl group may
be optionally substituted with one or more substituents
selected from the group consisting of hydroxyl,
cyano, alkoxy, =O, =S, NO₂, halogen, N(CH₃)₂, amino, and
SH.

"Alkenyl" refers to a straight-chain, branched or
cyclic unsaturated hydrocarbon group containing at least
one carbon-carbon double bond. Preferably, the alkenyl
group has 2 to 12 carbons. More preferably it is a
lower alkenyl of from 2 to 7 carbons, more preferably 2
to 4 carbons. The alkenyl group may be optionally
substituted with one or more substituents selected from
the group consisting of hydroxyl, cyano, alkoxy, =O, =S,
NO₂, halogen, N(CH₃)₂, amino, and SH.

"Alkynyl" refers to a straight-chain, branched or
cyclic unsaturated hydrocarbon containing at least one
carbon-carbon triple bond. Preferably, the alkynyl
group has 2 to 12 carbons. More preferably it is a lower
alkynyl of from 2 to 7 carbons, more preferably 2 to 4
carbons. The alkynyl group may be optionally
substituted with one or more substituents selected from
the group consisting of hydroxyl, cyano, alkoxy, =O, =S,
NO₂, halogen, N(CH₃)₂, amino, and SH.

"Alkoxy" refers to an "O-alkyl" group.

"Aryl" refers to an aromatic group which has at
least one ring having a conjugated pi-electron system
and includes carbocyclic aryl, heterocyclic aryl and
biaryl groups. The aryl group may be optionally
substituted with one or more substituents selected from
the group consisting of halogen, trihalomethyl,
hydroxyl, SH, OH, NO₂, amine, thioether, cyano, alkoxy,
alkyl, and amino.

"Alkaryl" refers to an alkyl that is covalently
joined to an aryl group. Preferably, the alkyl is a
lower alkyl.

"Carbocyclic aryl" refers to an aryl group wherein
the ring atoms are carbon.

"Heterocyclic aryl" refers to an aryl group having
from 1 to 3 heteroatoms as ring atoms, the remainder of
the ring atoms being carbon. Heteroatoms include
oxygen, sulfur, and nitrogen. Thus, heterocyclic aryl
groups include furanyl, thienyl, pyridyl, pyrrolyl,
N-lower alkyl pyrrolo, pyrimidyl, pyrazinyl, imidazolyl
and the like.

"Amide" refers to -C(O)-NH-R, where R is alkyl,
aryl, alkylaryl or hydrogen.

"Thioamide" refers to -C(S)-NH-R, where R is alkyl,
aryl, alkylaryl or hydrogen.

"Amine" refers to a -N(R')R'' group, where R' and
R'' are independently selected from the group consisting
of alkyl, aryl, and alkylaryl.

"Thioether" refers to -S-R, where R is alkyl, aryl,
or alkylaryl.

"Sulfonyl" refers to -S(O)₂-R, where R is aryl,
C(CN)=C-aryl, CH₂CN, alkylaryl, sulfonamide, NH-alkyl,
NH-alkylaryl, or NH-aryl.

The term "acyl" denotes groups -C(O)R, where R is
alkyl as defined above, such as formyl, acetyl,
propionyl, or butyryl.
It is understood by those skilled in the art that
when A₁, A₂, A₃, and A₄ are nitrogen or sulfur, the
corresponding R₄, R₅, R₆, and R₇, as well as the
corresponding bond, do not exist.

Examples of indoles having such fused rings (as
described in (e)(ii) above) include the following:



The six membered rings shown above exemplify
possible A rings in compound II.

Other preferred embodiments of the invention are
crystalline forms comprising
3-[(3-(2-carboxyethyl)-4-methylpyrrol-5-yl)methylene]-2-indolinone
as well as
3-[4-(4-formylpiperazine-1-yl)benzylidenyl]-2-indolinone.
The polypeptide of these crystalline forms can be FGFR,
and specifically, FGFR1.

In preferred embodiments, the crystalline forms of 
the invention can be defined by the structural 
coordinates set forth in Table 3 or Table 4. 

The use of X-ray crystallography can elucidate the 
three dimensional structure of crystalline forms of the 
invention. The first characterization of crystalline 
forms by X-ray crystallography can determine the unit 
cell shape and its orientation in the crystal. 

In other preferred embodiments, the invention
features a crystal of an FGF receptor tyrosine kinase
domain protein, where the crystal is characterized by
having monoclinic unit cells. The crystal may also be
characterized by having space group symmetry C2.

The term "unit cell" refers to the smallest and
simplest volume element (i.e., parallelepiped-shaped
block) of a crystal that is completely representative of
the unit of pattern of the crystal. The dimensions of
the unit cell are defined by six numbers: dimensions a,
b and c and angles α, β and γ. A crystal can be viewed
as an efficiently packed array of multiple unit cells.
Detailed descriptions of crystallographic terms are
described in, which is hereby incorporated herein by
reference in its entirety, including any drawings,
figures, and tables.

The term "monoclinic unit cell" refers to a unit
cell where a≠b≠c; α=γ=90°; and β>90°.

The term "space group" refers to the symmetry of a
unit cell. In a space group designation (e.g., C2) the
capital letter indicates the lattice type and the other
symbols represent symmetry operations that can be
carried out on the unit cell without changing its
appearance.
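
The way a space group designation splits into a lattice letter and symmetry symbols can be shown with a small Python sketch. The mapping of capital letters to lattice centering types follows common crystallographic usage and is not taken from this document; the function name split_space_group is hypothetical, and the sketch does not validate the symmetry symbols.

```python
from typing import Tuple

# Common lattice-type letters (assumed from general crystallographic usage).
LATTICE_TYPES = {
    "P": "primitive",
    "A": "A-face centered",
    "B": "B-face centered",
    "C": "C-face centered",
    "I": "body centered",
    "F": "all-face centered",
    "R": "rhombohedral",
}


def split_space_group(symbol: str) -> Tuple[str, str]:
    """Split a designation such as 'C2' into (lattice type, symmetry symbols)."""
    symbol = symbol.strip()
    if not symbol or symbol[0] not in LATTICE_TYPES:
        raise ValueError(f"unrecognized lattice letter in {symbol!r}")
    # Everything after the capital letter describes symmetry operations.
    return LATTICE_TYPES[symbol[0]], symbol[1:].strip()


print(split_space_group("C2"))
```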

The term "lattice" in reference to crystal
structures refers to the array of points defined by the
vertices of packed unit cells.

The term "symmetry operations" refers to
geometrically defined ways of exchanging equivalent
parts of a unit cell, or exchanging equivalent molecules
between two different unit cells. Examples of symmetry
operations are screw axes, centers of inversion, and
mirror planes.

In a preferred embodiment, the invention features a
crystalline form, where the monoclinic unit cells have
dimensions of about a=208.3 Å, b=57.8 Å, c=65.5 Å and
β=107.2°.

In a preferred embodiment, the invention features an
FGFR1 crystal, where the monoclinic unit cells have
dimensions of about a=211.6 Å, b=51.3 Å, c=66.1 Å and
β=107.7°.
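
The cell dimensions quoted above can be checked against the monoclinic definition, and a cell volume derived from them, with the following Python sketch. It uses the standard geometric relation V = a·b·c·sin β for a cell with α = γ = 90°, which is general crystallographic knowledge rather than something stated in this document. The function names are hypothetical, and the volumes are only as precise as the "about" values given.

```python
import math


def is_monoclinic(a: float, b: float, c: float,
                  alpha: float, beta: float, gamma: float) -> bool:
    """Check the definition used in the text: a≠b≠c, α=γ=90°, β>90°."""
    right_angles = (math.isclose(alpha, 90.0, abs_tol=1e-6)
                    and math.isclose(gamma, 90.0, abs_tol=1e-6))
    return a != b and b != c and right_angles and beta > 90.0


def monoclinic_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Volume in cubic angstroms of a monoclinic cell: a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


# The two sets of unit cell dimensions given in the text (angstroms, degrees).
cells = {
    "first cell": (208.3, 57.8, 65.5, 107.2),
    "second cell": (211.6, 51.3, 66.1, 107.7),
}
for name, (a, b, c, beta) in cells.items():
    print(name, is_monoclinic(a, b, c, 90.0, beta, 90.0),
          round(monoclinic_volume(a, b, c, beta)))
```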

In another aspect the invention features a
polypeptide corresponding to the catalytic domain of a
protein tyrosine kinase, containing at least about 20
amino acid residues upstream of the first glycine in the
conserved glycine-rich region of the catalytic domain,
and at least about 17 amino acid residues downstream of
the conserved arginine located at the C-terminal
boundary of the catalytic domain.

The polypeptides of the invention can be isolated, 
enriched or purified. In addition, the crystalline 
forms of the invention can be formed from polypeptides 
that are isolated, enriched, or purified. 

By "isolated" in reference to a polypeptide is 
meant a polymer of 6, 12, 18 or more amino acids 
conjugated to each other, including polypeptides that 
are isolated from a natural source or that are 
synthesized. The isolated polypeptides of the present 
invention are unique in the sense that they are not 
found in a pure or separated state in nature. Use of 
the term "isolated" indicates that a naturally occurring 
sequence has been removed from its normal cellular 
environment. Thus, the sequence may be in a cell-free
solution or placed in a different cellular environment. 
The term does not imply that the sequence is the only 
amino acid chain present, but that it is essentially 
free (about 90 - 95% pure at least) of material 
naturally associated with it. 

By the use of the term "enriched" in reference to a 
polypeptide it is meant that the specific amino acid 
sequence constitutes a significantly higher fraction (2-5
fold) of the total of amino acids present in the
cells or solution of interest than in normal or diseased 
cells or in the cells from which the sequence was taken. 
This could be caused by a person by preferential 
reduction in the amount of other amino acids present, or 
by a preferential increase in the amount of the specific 
amino acid sequence of interest, or by a combination of 
the two. However, it should be noted that "enriched"
does not imply that there are no other amino acid
sequences present, just that the relative amount of the
sequence of interest has been significantly increased.
The term "significant" here is used to indicate that the
level of increase is useful to the person making such an
increase, and generally means an increase relative to
other amino acids of at least about 2 fold, more
preferably at least 5 to 10 fold or even more. The term
also does not imply that there are no amino acids from
other sources. The other source amino acids may, for
example, comprise amino acids encoded by a yeast or
bacterial genome, or a cloning vector such as pUC19.
The term is meant to cover only those situations in
which a person has intervened to elevate the proportion
of the desired nucleic acid.

It is also advantageous for some purposes that an 
amino acid sequence be in purified form. The term 
"purified" in reference to a polypeptide does not 
require absolute purity (such as a homogeneous
preparation); instead, it represents an indication that
the sequence is relatively purer than in the natural
environment (compared to the natural level this level
should be at least 2-5 fold greater, e.g., in terms of
mg/mL). Purification of at least one order of
magnitude, preferably two or three orders, and more
preferably four or five orders of magnitude is expressly
contemplated. The substance is preferably free of
contamination at a functionally significant level, for
example 90%, 95%, or 99% pure.

In a preferred embodiment, the invention features a
polypeptide corresponding to the catalytic domain of a
receptor PTK. The receptor PTK may have a three-
dimensional structure substantially similar to that of
the insulin receptor, even though the amino acid content
may be different.

In a preferred embodiment, the invention features a
polypeptide corresponding to the catalytic domain of a
non-receptor PTK, where the non-insulin receptor
tyrosine kinase is a cytoplasmic tyrosine kinase.

In a preferred embodiment, the invention features a
polypeptide corresponding to the catalytic domain of a
receptor PTK, selected from the group consisting of
FGF-R, PDGF-R, KDR, CCK4, MET, TRKA, AXL, TIE, EPH, RYK,
DDR, ROS, RET, LTK, ROR1, or MUSK.

In a preferred embodiment, the invention features a
polypeptide corresponding to the catalytic domain of a
non-receptor PTK, selected from the group consisting of
SRC, BRK, BTK, CSK, ABL, ZAP70, FES, FAK, JAK, or ACK.

In a preferred embodiment, the invention features a
polypeptide corresponding to the catalytic domain of a
PTK, having the amino acid sequence shown in Table 1 or
Table 2.

In another aspect, the invention features a method 
for creating crystalline forms described herein. The 
method may utilize the polypeptides described herein to 
form a crystal. The method comprises the steps of:

(a) mixing a volume of polypeptide solution
with a reservoir solution; and

(b) incubating the mixture obtained in step
(a) over the reservoir solution in a closed container,
under conditions suitable for crystallization.

These processes are described in detail in the
section entitled "Detailed Description of the
Invention."

In another aspect, the invention features a method
of obtaining FGF receptor tyrosine kinase domain
polypeptide in crystalline form, comprising the steps
of: (a) mixing a volume of polypeptide solution with an
equal volume of reservoir solution, where the
polypeptide solution comprises 1 mg/mL to 60 mg/mL
FGF-type tyrosine kinase domain protein, 10 mM to 200 mM
buffering agent, 0 mM to 20 mM dithiothreitol and has a
pH of about 5.5 to about 7.5, and where the reservoir
solution comprises 10% to 30% (w/v) polyethylene glycol,
0.1 M to 0.5 M ammonium sulfate, 0% to 20% (w/v)
ethylene glycol or glycerol, 10 mM to 200 mM buffering
agent and has a pH of about 5.5 to about 7.5; and (b)
incubating the mixture obtained in step (a) over said
reservoir solution in a closed container at a
temperature between 0° and 25°C until crystals form.
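
The concentration and pH ranges recited in this method lend themselves to a simple tabular check. The sketch below is illustrative only: the dictionary keys and the `out_of_range` helper are hypothetical names, and treating the "about" pH limits as hard bounds is an assumption made for simplicity.

```python
# Ranges transcribed from the method above. Key names are hypothetical and
# the "about" pH limits are treated as hard bounds (an assumption).
POLYPEPTIDE_SOLUTION = {
    "protein_mg_per_ml": (1.0, 60.0),
    "buffer_mM": (10.0, 200.0),
    "dtt_mM": (0.0, 20.0),
    "pH": (5.5, 7.5),
}
RESERVOIR_SOLUTION = {
    "peg_percent_wv": (10.0, 30.0),
    "ammonium_sulfate_M": (0.1, 0.5),
    "cryoprotectant_percent_wv": (0.0, 20.0),  # ethylene glycol or glycerol
    "buffer_mM": (10.0, 200.0),
    "pH": (5.5, 7.5),
}
TEMPERATURE_C = (0.0, 25.0)


def out_of_range(values, ranges):
    """Return the names of parameters that are missing or outside their range."""
    problems = []
    for name, (low, high) in ranges.items():
        value = values.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return problems


# Illustrative trial reservoir composition (hypothetical input).
trial_reservoir = {
    "peg_percent_wv": 16.0,
    "ammonium_sulfate_M": 0.3,
    "cryoprotectant_percent_wv": 5.0,
    "buffer_mM": 100.0,
    "pH": 6.5,
}
print(out_of_range(trial_reservoir, RESERVOIR_SOLUTION))
```

An empty list indicates that every parameter of the trial composition falls within the recited ranges.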

In a preferred embodiment, the invention features a
method of obtaining FGF receptor tyrosine kinase domain
polypeptide in crystalline form, where the polypeptide
solution comprises about 10 mg/mL FGF receptor tyrosine
kinase domain, about 10 mM sodium chloride, about 2 mM
dithiothreitol, about 10 mM Tris-HCl and has a pH of
about 8; the reservoir buffer comprises about 16% (w/v)
polyethylene glycol (MW 10000), about 0.3 M ammonium
sulfate, about 5% ethylene glycol or glycerol, about 100
mM bis-Tris and has a pH of about 6.5; and the
temperature is about 4°C.

In another preferred embodiment, the invention
features a method of obtaining FGF receptor tyrosine
kinase domain polypeptide in crystalline form, where the
polypeptide solution includes a compound such as a
cofactor, substrate, substrate analog, inhibitor or
allosteric effector.

In still another preferred embodiment, the
invention features a method of obtaining FGF receptor
tyrosine kinase domain polypeptide in crystalline form,
where the compound is a nucleotide analog, such as a
non-hydrolyzable analog of ATP, or an indolinone.
Indolinone compounds have the general structural formula
as described herein.

In another aspect, the invention features a cDNA 
encoding an FGF receptor tyrosine kinase domain protein, 
where a coding strand of the cDNA has the nucleotide 
sequence of SEQ ID NO: 5.

Another aspect of the invention relates to a method 
of determining three dimensional structures of PTKs with 
unknown structure by utilizing the structural 
coordinates of Table 1, Table 2, Table 3, and Table 4. 

These methods can relate to homology modeling, molecular
replacement, and nuclear magnetic resonance methods.

In a preferred embodiment, the invention relates to 
a method of determining three dimensional structures of 
PTKs with unknown structures by utilizing the 
coordinates of Table 1, Table 2, Table 3, or Table 4 in
conjunction with the amino acid sequences of PTKs. This
method of homology modeling comprises the steps of: (a)
aligning the computer representation of an amino acid
sequence of a PTK with unknown structure with that of a
PTK with known structure, where alignment is achieved by
matching homologous regions of the amino acid sequences;
(b) transferring the computer representation of an amino
acid structure in the PTK sequence of known structure to
a computer representation of a structure of the
corresponding amino acid in the PTK sequence with
unknown structure; and (c) determining low energy
conformations of the resulting PTK structure.

The term "amino acid sequence" describes the order 
of amino acids in the amino acid chain comprising a 
polypeptide corresponding to the catalytic domain of a 
PTK.

The term "aligning" describes matching the
beginning and the end of two or more amino acid
sequences. Homologous amino acid sequences are placed
on top of one another during the alignment process.

The term "homologous" describes amino acids in two
sequences that are identical or have similar side-chain
chemical groups (e.g., aliphatic, aromatic, polar,
negatively charged, or positively charged).
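
The definition of "homologous" given above can be expressed as a small classification routine. The following is a minimal sketch, not a definitive implementation: the assignment of each one-letter residue code to a side-chain class is an assumption made for illustration (several residues can reasonably be placed in more than one class), and the function names and example sequences are hypothetical.

```python
# Assumed grouping of one-letter amino acid codes by side-chain chemistry.
SIDE_CHAIN_CLASSES = {
    "aliphatic": "GAVLIPM",
    "aromatic": "FWY",
    "polar": "STNQC",
    "negatively charged": "DE",
    "positively charged": "KRH",
}
RESIDUE_CLASS = {
    residue: name
    for name, residues in SIDE_CHAIN_CLASSES.items()
    for residue in residues
}


def is_homologous(a, b):
    """True if two residues are identical or share a side-chain class."""
    class_a = RESIDUE_CLASS.get(a.upper())
    class_b = RESIDUE_CLASS.get(b.upper())
    # Gaps and unknown symbols have no class and never count as homologous.
    return class_a is not None and class_a == class_b


def homologous_fraction(aligned_a, aligned_b):
    """Fraction of aligned positions that are homologous."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(is_homologous(x, y) for x, y in zip(aligned_a, aligned_b))
    return matches / len(aligned_a)


# Hypothetical aligned fragments, with '-' marking a gap.
print(homologous_fraction("GKDL-F", "GREIAY"))
```

Scoring aligned positions in this way is one simple means of locating the homologous regions on which the alignment step relies.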

The term "corresponding" refers to an amino acid 
that is aligned with another in the sequence alignment
mentioned above. 

The term "determining the low energy conformation" 
describes a process of changing the conformation of the 
PTK structure such that the structure is of low free 
energy. The PTK structure may or may not have molecules,
such as modulators, bound to it.

The term "low free energy" describes a state where 
the molecules are in a stable state as measured by the 
process. A stable state is achieved when favorable 
interactions are formed within the complex.

The term "favorable interactions" refers to
hydrophobic, aromatic, and ionic forces, and hydrogen
bonds.

WO 98/07835 PCT/US97/14885

Another preferred embodiment of the invention 
relates to a method of determining three dimensional 
structures of PTKs with unknown structure. This method 
is accomplished by applying the structural coordinates 
of Table 1, Table 2, Table 3, or Table 4 to an 
incomplete X-ray crystallographic data set for a PTK.
The method comprises the steps of: (a) aligning the 
positions of atoms in the unit cell by matching electron 
diffraction data from two crystals, where one data set 
is complete and the other is incomplete; and (b) 
determining a low energy conformation of the resulting 
PTK structure. 

The term "incomplete data set" relates to an X-ray
crystallographic data set that does not have enough
information to give rise to a three dimensional
structure.

In another preferred embodiment, the invention 
relates to a method of determining three dimensional 
structures of PTKs with unknown structure by applying 
the structural coordinates of Table 1, Table 2, Table 3, 
or Table 4 to nuclear magnetic resonance (NMR) data of a 
PTK. This method comprises the steps of: (a) 
determining the secondary structure of a PTK structure 
using NMR data; and (b) simplifying the assignment of 
through-space interactions of amino acids. The PTK
structure may not be complexed with compounds or
modulators.

The term "secondary structure" describes the
arrangement of amino acids in a three dimensional
structure, such as in α-helix or β-sheet elements.

The term "through-space interactions" defines the
orientation of the secondary structural elements in the
three dimensional structure and the distances between
amino acids from different portions of the amino acid
sequence.

The term "assignment" defines a method of analyzing 
NMR data and identifying which amino acids give rise to 
signals in the NMR spectrum. 

In another aspect, the invention features a method
of identifying potential modulators of PTK function.
These modulators are identified by docking a computer
representation of a structure of a compound with a
computer representation of a cavity formed by the
active-site of a PTK. The computer representation of
the PTK active-site structure can be defined by
structural coordinates.

The term "chemical group" refers to moieties that 
can form hydrogen bonds, hydrophobic, aromatic, or ionic 

interactions.

The term "docking" refers to a process of placing a 
compound in close proximity with a PTK. The term can 
also refer to a process of finding low energy 
conformations of the compound/PTK complex. 

A preferred embodiment of the invention is a method
of identifying potential modulators of PTK function.
The method involves utilizing the structural coordinates
of a PTK three dimensional structure. The structural
coordinates set forth in Table 1, Table 2, Table 3, and
Table 4 can be utilized. The method comprises the steps
of: (a) removing a computer representation of a PTK
structure and docking a computer representation of a
compound from a computer data base with a computer
representation of the active-site of the PTK; (b)
determining a conformation of the complex with a
favorable geometric fit and favorable complementary
interactions; and (c) identifying compounds that best
fit the PTK active-site as potential modulators of PTK
function. The initial PTK structure may or may not have
compounds bound to it.

The term "favorable geometric fit" refers to a
conformation of the compound-PTK complex where the
surface area of the compound is in close proximity with
the surface area of the active-site without forming
unfavorable interactions. Unfavorable interactions can
be steric hindrances between atoms in the compound and
atoms in the PTK active-site.
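
By way of illustration, a "favorable geometric fit" can be approximated by counting steric clashes and close contacts between compound atoms and active-site atoms. The sketch below is a simplified stand-in for what a docking program evaluates, not a description of any particular program: the distance cutoffs (3.0 Å for a clash, 4.5 Å for a contact), the function names, and the coordinates are all assumptions chosen for the example.

```python
import math


def count_steric_clashes(compound_atoms, site_atoms, clash_distance=3.0):
    """Count compound/site atom pairs closer than clash_distance (angstroms)."""
    return sum(
        1
        for c in compound_atoms
        for s in site_atoms
        if math.dist(c, s) < clash_distance
    )


def count_contacting_atoms(compound_atoms, site_atoms,
                           clash_distance=3.0, contact_distance=4.5):
    """Count compound atoms with at least one site atom in the contact shell."""
    return sum(
        1
        for c in compound_atoms
        if any(clash_distance <= math.dist(c, s) <= contact_distance
               for s in site_atoms)
    )


def has_favorable_geometric_fit(compound_atoms, site_atoms):
    """Close proximity to the site without any steric hindrance."""
    return (count_steric_clashes(compound_atoms, site_atoms) == 0
            and count_contacting_atoms(compound_atoms, site_atoms) > 0)


# Hypothetical (x, y, z) coordinates in angstroms.
compound = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
active_site = [(0.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
print(has_favorable_geometric_fit(compound, active_site))
```

A practical scoring function would also weigh the complementary interactions defined in this section; only the geometric component is sketched here.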

The term "favorable complementary interactions" 
relates to hydrophobic, aromatic, ionic, and hydrogen 
bond donating, and hydrogen bond accepting forces formed 
between the compound and the PTK active-site.

The term "potential" qualifies the term "modulator
of PTK function" because the potential modulator of PTK
function has not yet been tested for activity in vitro
or in vivo.

The term "best fit" describes compounds that
complex the most surface area in the complex and/or
form the most favorable complementary interactions with
the PTK in the screen in a given experiment.

Another preferred embodiment of the invention is a
method of identifying potential modulators of PTK
function. The method involves utilizing a three
dimensional structure of a PTK, with or without
compounds bound to it. The method comprises the steps
of: (a) modifying a computer representation of a PTK
having one or more compounds bound to it, where the
computer representations of the compound or compounds
and PTK are defined by structural coordinates; (b)
determining a conformation of the complex with a
favorable geometric fit and favorable complementary
interactions; and (c) identifying the compounds that
best fit the PTK active-site as potential modulators of
PTK function.

The term "modifying" relates to deleting a chemical 
group or groups or adding a chemical group or groups. 
Computer representations of the chemical groups can be
selected from a computer data base.

Yet another preferred embodiment of the invention 
is a method of identifying potential modulators of PTK 
function by operating modulator construction or 
modulator searching computer programs on the compounds
complexed with the PTK. The method comprises the steps
of: (a) removing a computer representation of one or
more compounds complexed with a PTK; and (b) searching a
data base for compounds similar to the removed compounds
using a compound searching computer program, or
replacing portions of the compounds complexed with the
PTK with similar chemical structures from a data base
using a compound construction computer program, where
the representations of the compounds are defined by
structural coordinates.

The term "operating" as used herein refers to
utilizing the three-dimensional conformation of
molecules defined by the processes described herein in
various computer programs.

The term "similar compound" refers to a compound in
a computer data base that has a similar geometric
structure to compounds that can bind to a PTK. The
similar compound can also have similar chemical groups
to the compounds that are either bound to the PTK or
once bound to the PTK. The similar chemical groups can
form complementary interactions with the PTK.

The term "compound searching computer program"
describes a computer program that searches a computer
data base for computer representations of compounds
that have similar three dimensional structures and
similar chemical groups to a compound of interest. The
compound of interest is preferably an indolinone
compound.

The term "similar chemical structures" refers to
chemical groups that share similar geometry with portions
of the compounds in complex with the PTK or compounds
removed from the PTK structure. Similar chemical
structures can also refer to chemical groups that may
form similar complementary interactions to portions of
the compounds in complex with the PTK or compounds
removed from the PTK structure.

The term "replacing structures" refers to removing
a portion of the compounds in complex with the PTK or
compounds removed from the PTK structure and connecting
the broken bonds to a similar chemical structure.

The term "compound construction computer program"
describes a computer program that replaces computer
representations of chemical groups in a compound with
groups from a computer data base. The compound is
preferably an indolinone compound.

The term "similar three dimensional structure"
describes two molecules with nearly identical shape and
volume.

In another preferred embodiment of the invention, 
the PTK structures used in the modulator design or 
identification method of the invention are defined by 
the structural coordinates of Table 1, Table 2, Table 3,
or Table 4.

The methods for using the crystalline forms and 
three dimensional structures of the invention can relate 
to a broad range of protein kinases. Thus, in preferred 
embodiments, the invention relates to a receptor PTK.
The receptor PTK can be selected from the group
consisting of FGF-R, PDGF-R, FLK, CCK4, MET, TRKA, AXL,
TIE, EPH, RYK, DDR, ROS, RET, LTK, ROR1, and MUSK. The
PTK may also exist as a non-receptor PTK. The
non-receptor PTK can be selected from the group consisting
of SRC, BRK, BTK, CSK, ABL, ZAP70, FES, FAK, JAK, and
ACK.

In another aspect, the invention features a 
potential modulator of PTK function identified by 
methods disclosed in the invention. 

A preferred embodiment of the invention is that the
potential modulator of PTK function is an oxindolinone
or a thiolindolinone of formula I or II disclosed above.

Another aspect of the invention is a method for 
synthesizing a potential modulator of PTK function or
its pharmaceutically acceptable salts, isomers,
metabolites, esters, amides, or prodrugs by a standard
synthetic method known in the art. Synthetic procedures
are discussed below.

In another aspect, the invention features a method 
of identifying a potential modulator of PTK function as 
a modulator of PTK function. The method comprises the 
steps of: (a) administering a potential modulator of PTK 
function to cells; (b) comparing the level of PTK 
phosphorylation between cells not administered the 
potential modulator and cells administered the potential 
modulator; and (c) identifying the potential modulator
as a modulator of PTK function based on the difference 
in the level of PTK phosphorylation. 
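
The comparison in steps (b) and (c) can be illustrated with a short calculation. The sketch below is an assumption-laden example rather than an assay protocol: the function names, the 50% threshold, and the numerical phosphorylation levels are hypothetical, and a real experiment would also require appropriate replicates and statistics.

```python
from statistics import mean


def relative_change(untreated, treated):
    """Fractional change in mean signal for treated versus untreated cells."""
    baseline = mean(untreated)
    if baseline == 0:
        raise ValueError("untreated baseline must be non-zero")
    return (mean(treated) - baseline) / baseline


def is_modulator(untreated, treated, threshold=0.5):
    """Flag a compound whose effect meets an assumed threshold (here 50%)."""
    return abs(relative_change(untreated, treated)) >= threshold


# Hypothetical normalized PTK phosphorylation levels from replicate wells.
untreated_levels = [1.0, 1.0, 1.0]
treated_levels = [0.25, 0.5, 0.75]

change = relative_change(untreated_levels, treated_levels)
print(f"relative change: {change:+.0%}")
print("modulator" if is_modulator(untreated_levels, treated_levels) else "no effect")
```

Because the absolute value of the change is tested, the same comparison covers both inhibitors, which lower the signal, and activators, which raise it.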

The term "cells" refers to any type of cells either 
primary or cultured. Primary cells can be extracted 
directly from an organism while cultured cells rapidly 
divide and can be cultured in many successive rounds. 
Cells can be grown in a variety of containers including, 
but not limited to flasks, dishes, and well plates. 

The term "administer" refers to a method of 
delivering a compound to cells. The compound can be 
prepared using a carrier such as dimethyl sulfoxide 
(DMSO) in an aqueous solution. The aqueous solution 
comprising the compound, also termed an "aqueous 
preparation", can be simply mixed into the medium 
bathing the layer of cells or microinjected into the
cells themselves. The compounds may be administered to 
the cells using a suitable buffered solution. 

The term "suitable buffered solution" refers to an 
aqueous preparation of the compound that comprises a 
salt that can control the pH of the solution at low 
concentrations. Because the salt exists at low
concentrations, the salt preferably does not alter the
function of the cells.

The term "PTK phosphorylation" refers to the
presence of phosphate on the PTK. Phosphates on PTKs
can be identified by antibodies that bind them
specifically with high affinity.

In another aspect, the invention features a method 
of identifying a potential modulator of PTK function as 
a modulator of PTK function. The method comprises the 
steps of: (a) administering a potential modulator of PTK
function to cells; (b) comparing the level of cell
growth between cells not administered the potential
modulator and cells administered the potential
modulator; and (c) identifying the potential modulator
as a modulator of PTK function based on the difference
in cell growth. 

The term "cell growth" refers to the rate at which 
a group of cells divides. Cell division rates can be
readily measured by methods utilized by those skilled in
the art.

Another aspect of the invention features a method 
of diagnosing a disease by identifying cells harboring a 
PTK with inappropriate activity. The method comprises 
the steps of: (a) administering a modulator of PTK
function to cells; (b) comparing the rate of cell growth
between cells not administered the modulator and cells
administered the modulator; and (c) diagnosing a disease
by characterizing cells harboring a PTK with
inappropriate activity from the effect of the modulator
on the difference in the rate of cell growth. The
modulator can be identified by the methods of the
invention.

The term "inappropriate activity" refers to a PTK 
that regulates a step in a signal transduction process 
at a higher or lower rate than in normal cells.
Aberrations in the rate of signal transduction can be
caused by alterations in the stimulation of a receptor
PTK by a growth factor, alterations in the activity of a
PTK-specific phosphatase, over-expression of a PTK in a
cell, or mutations in the catalytic region of the PTK
itself.

The term "signal transduction process" describes 
the steps in a cascade of events where an extracellular 
signal is transmitted into an intracellular signal. 

The term "PTK-specific phosphatase" describes an
enzyme that dephosphorylates a particular PTK and
thereby regulates that PTK's activity.

Another aspect of the invention is a method of 
treating a disease associated with a PTK with 
inappropriate activity in a cellular organism, where the 
method comprises the steps of: (a) administering the 
modulator of PTK function to the organism, where the 
modulator is in an acceptable pharmaceutical 
preparation; and (b) activating or inhibiting the PTK 
function to treat the disease. 

The term "organism" relates to any living being 
comprised of at least one cell. An organism can be as 
simple as one eukaryotic cell or as complex as a mammal. 

The term "administering", in reference to an 
organism, refers to a method of introducing the compound 
to the organism. The compound can be administered when 
the cells or tissues of the organism exist within the
organism or outside of the organism. Cells existing
outside the organism can be maintained or grown in cell
culture dishes. For cells harbored within the organism,
many techniques exist in the art to administer
compounds, including (but not limited to) oral,
parenteral, dermal, and injection applications. For
cells outside of the patient, multiple techniques exist
in the art to administer the compounds, including (but
not limited to) cell microinjection techniques,
transformation techniques, and carrier techniques.

The term "pharmaceutically acceptable composition"
refers to a preparation comprising the modulator of PTK
activity. The composition is acceptable if it does not
appreciably cause irritations to the organism
administered the compound.

Preferred embodiments of the invention are
that the PTK is a receptor PTK selected from the group
consisting of FGF-R, PDGF-R, FLK-1, CCK4, MET, TRKA,
AXL, TIE, EPH, RYK, DDR, ROS, RET, LTK, ROR1, and MUSK.
Other preferred embodiments of the invention are that
the PTK is a non-receptor PTK selected from the group
consisting of SRC, BRK, BTK, CSK, ABL, ZAP70, FES, FAK,
JAK, and ACK.

The summary of the invention described above is
non-limiting and other features and advantages of the
invention will be apparent from the following detailed
description, and from the claims.



BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 provides a ribbon diagram of the structure
of FGFR1 showing the side chains of tyrosines Tyr-653
and Tyr-654 and the α helical (αC, αD, αE, αEF, αF-αI),
β strand (β1-β5, β7, β8), nucleotide-binding loop,
catalytic loop, activation loop and kinase insert
regions of the molecule. The termini are denoted by N
and C. The loop between β2 and β3 is disordered,
indicated by a break in the chain in this region.

FIG. 2 provides a stereo view of a Cα trace of FGFR1
shown in the same orientation as FIG. 1, with every
tenth amino acid residue marked with a filled circle and
every twentieth amino acid residue labeled with a
residue number.

FIG. 3 provides a structure-based sequence 
alignment of human fibroblast growth factor receptor 1 
(FGFR1), human fibroblast growth factor receptor 2 
(FGFR2), human fibroblast growth factor receptor 3
(FGFR3), human fibroblast growth factor receptor 4
(FGFR4), a D. melanogaster homolog (DFGFR1), a C.
elegans homolog (EGL-15) and insulin receptor tyrosine
kinase (IRK).

FIGS. 4A and 4B provide ribbon diagrams of the
N-terminal lobes (4A) and C-terminal lobes (4B) of FGFR1
and IRK in which the Cα atoms of the β sheets (4A) or
α-helices (4B) of the two proteins have been superimposed.

FIG. 5 illustrates the side-chain positions of the
tyrosine autophosphorylation sites of FGFR1 on the
backbone representation of FGFR1.

FIGS. 6A and 6B are amino acid sequence alignments 
of the catalytic domains of PTKs, including receptor and 
non-receptor type PTKs. FIG. 6A depicts one
representative member from each of the eighteen
subfamilies of receptor tyrosine kinases. FIG. 6B
depicts one representative member from each of the
subfamilies of cytoplasmic tyrosine kinases. In FIGS.
6A and 6B highly conserved residues are boxed. The
positions of the glycine-rich domain, kinase insert,
catalytic loop, and activation loop are indicated. The
numbering is for human FGF-receptor.

BRIEF DESCRIPTION OF THE CRYSTALLOGRAPHIC ATOMIC
STRUCTURAL COORDINATES

The crystallographic structural coordinates are
located at the end of the section entitled "Examples"
and before the claims. Three sets of coordinates can be
found in the Protein Data Bank under accession names
1FGK, 1AGW, and 1FGI. The 1FGK coordinates correspond
to those listed in Table 1, the 1AGW coordinates
correspond to those listed in Table 4, and the 1FGI
coordinates correspond to those listed in Table 3. The
1AGW and 1FGI coordinate sets will be publicly
available in March 1998.

Table 1 provides the atomic structure coordinates
of native FGFR1 crystals of the invention as determined
by X-ray crystallography; and

Table 2 provides the atomic structure coordinates
of FGFR1:AMP-PCP co-crystals of the invention as
determined by X-ray crystallography.

Table 3 lists crystallographic coordinates defining
the three dimensional structure of FGF-R1 complexed with
3-[(3-(2-carboxyethyl)-4-methylpyrrol-5-yl)methylene]-2-indolinone.
The columns (from left to right) are
descriptions of the atoms by number and type, amino acid
and number containing the atom, the x coordinate, y
coordinate, z coordinate, bond connectivity, and
temperature factor. All of these parameters are well
defined in the art.
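
The column layout described for Table 3 can be read into a simple record structure. The sketch below assumes a whitespace-separated row, optionally preceded by an "ATOM" keyword, with the columns in the order stated above; the class name, field names, and the sample row are hypothetical and are not taken from the tables themselves. The eighth column is named after the text's own description ("bond connectivity").

```python
from dataclasses import dataclass


@dataclass
class AtomRecord:
    atom_number: int
    atom_type: str
    residue_name: str
    residue_number: int
    x: float
    y: float
    z: float
    connectivity: float        # column the text calls "bond connectivity"
    temperature_factor: float


def parse_atom_record(line):
    """Parse one whitespace-separated coordinate row (assumed layout)."""
    fields = line.split()
    if fields and fields[0] == "ATOM":
        fields = fields[1:]          # drop an optional record keyword
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    return AtomRecord(
        int(fields[0]), fields[1], fields[2], int(fields[3]),
        *(float(value) for value in fields[4:9]),
    )


# Hypothetical row used only to exercise the parser.
sample_row = "ATOM 1 N GLU 464 12.345 -3.210 45.678 1.00 25.30"
record = parse_atom_record(sample_row)
print(record.residue_name, record.residue_number, (record.x, record.y, record.z))
```

Files deposited in the Protein Data Bank use fixed-width columns, so a production parser would slice by column position rather than split on whitespace.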

Table 4 is a file of crystallographic coordinates
defining the three dimensional structure of FGF-R1
complexed with
3-[4-(4-formylpiperazine-1-yl)benzylidenyl]-2-indolinone.
The columns are as described in Table 3.

DETAILED DESCRIPTION OF THE INVENTION 
The present invention is directed to the design and 
identification of modulators of protein tyrosine kinase 
function that are PTK subfamily specific,
non-hydrolyzable under acidic conditions, and highly
bioavailable. The three dimensional structures of a PTK
optionally complexed with compounds can facilitate 
design and identification of modulators of PTK function. 

Protein tyrosine kinases (PTKs) comprise a large 
and diverse class of enzymes. Schlessinger and Ullrich, 
1992, Neuron 9: 383-391. The PTK family is subdivided 
into members that are receptors and those that are non- 
receptors. The PTK receptor family contains multiple 
subfamilies, one of which is the fibroblast growth 
factor receptor (FGF-R) PTK which is a molecule 
implicated in regulating angiogenesis as well as cellular
proliferation and differentiation. Givol and Yayon, 
1992, FASEB J. 6 (15): 3362-3369. 

FGF-R1 can mediate cellular functions by its role
in one or more cellular signal transduction processes.
Cellular signal transduction processes comprise multiple
steps that convert an extracellular signal into an
intracellular signal.

Receptor PTK mediated signal transduction is
initiated by binding a specific extracellular ligand,
followed by receptor dimerization, and subsequent
autophosphorylation of the receptor PTK. The phosphate
groups are binding sites for intracellular signal
transduction molecules, which leads to the formation of
protein complexes at the cell membrane. These complexes
facilitate an appropriate cellular effect (e.g., cell
division, metabolic effects to the extracellular
microenvironment) in response to the ligand that began
the cascade of events.

Receptor PTKs function as binding sites for several
intracellular proteins. Intracellular PTK binding
proteins are divided into two principal groups: (1)
those which harbor a catalytic domain; and (2) those
which lack such a domain but serve as adaptors and
associate with catalytically active molecules. Songyang
et al., 1993, Cell 72:767-778. SH2 (src homology)
domains are common adaptors found in proteins which
directly bind to the receptor PTK. SH2 domains are
harbored by PTK binding proteins of both groups
mentioned above. Fantl et al., 1992, Cell 69:413-423;
Songyang et al., 1994, Mol. Cell. Biol. 14:2777-2785;
Songyang et al., 1993, Cell 72:767-778; and Koch et al.,
1991, Science 252:668-678.

The specificity of the interactions between 
receptor PTKs and the SH2 domains of their binding 
proteins is determined by the amino acid residues 
immediately surrounding the phosphorylated tyrosine 
residue. Differences in the binding affinities of SH2 
domains are correlated with the observed differences in 
substrate phosphorylation profiles of downstream 
molecules in the signal transduction process. Songyang 
et al., 1993, Cell 72:767-778. These observations 
suggest that the function of each receptor PTK is 
determined not only by its pattern of expression and 
ligand availability but also by the array of downstream 
signal transduction pathways that are activated by a 
particular receptor. Thus, PTKs provide a controlling 
regulatory role in signal transduction processes as a 
consequence of autophosphorylation. 

PTK-mediated signal transduction regulates 
proliferative, differentiation, and metabolic responses 
in cells. Therefore, inappropriate PTK activity can 
result in a wide array of disorders and diseases. These 
disorders, which are described below, may be treated by 
the modulators of PTK function designed or identified by 
the methods disclosed herein. 

The present invention also relates to crystalline 
polypeptides corresponding to the catalytic domain of 
receptor tyrosine kinases. Such tyrosine kinases 
include receptors of a class that are not covalently 
cross-linked but are understood to undergo 
ligand-induced dimerization, as well as cytoplasmic tyrosine 
kinases. Preferably, the crystalline catalytic domains 
are of sufficient quality to allow for the determination 
of a three-dimensional X-ray diffraction structure to a 
resolution of about 1.5 Å to about 2.5 Å. The invention 
also relates to methods for preparing and crystallizing 
the polypeptides. The polypeptides themselves, as well 
as information derived from their crystal structures, can 
be used to analyze and modify tyrosine kinase activity 
as well as to identify compounds that interact with the 
catalytic domain. 

The polypeptides of the invention are designed on 
the basis of the structure of a region in the 
cytoplasmic domain of the receptor tyrosine kinase that 
contains the catalytic domain. By way of illustration, 
FIG. 6A shows the amino acid sequence alignment of the 
catalytic domains of eighteen human receptor tyrosine 
kinases; one representative member from each of the 
eighteen subfamilies is shown. FIG. 6B shows the 
alignment for cytoplasmic kinases. The applicants have 
discovered and determined the boundaries of the domain 
required for crystallization of the resulting 
polypeptide. Surprisingly, these boundaries differ from 
those required for catalytic activity. For example, 
referring to FIG. 6A, the domain required for catalytic 
activity is generally believed to span about 7 amino 
acid residues upstream of the first glycine (FIG. 6A, 
residue number 485) of the N-terminal glycine-rich 
region through about 10 residues beyond the C-terminal 
conserved arginine (FIG. 6A, residue number 744). 
However, additional sequence upstream of the 
N-terminal glycine-rich region and downstream of the 
C-terminal conserved arginine can be required for 
crystallization. In particular, at least about 20 amino 
acid residues (+/- 5 amino acid residues) upstream of 
the first glycine (i.e., FIG. 6A, residue number 485) in 
the conserved glycine-rich region of the catalytic 
domain, and at least about 17 amino acid residues (+/- 5 
amino acid residues) downstream of the conserved 
arginine (i.e., FIG. 6A, residue number 744) located at 
the C-terminal boundary of the catalytic domain can be 
required to engineer a polypeptide suitable for 
crystallization. 
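
The boundary arithmetic described above can be illustrated with a short sketch. It is only an illustration: the function name and its parameters are hypothetical, the residue numbers are the FIG. 6A alignment positions quoted in the text, and the tolerance is treated here as a simple symmetric window around each nominal boundary.

```python
# Alignment positions quoted in the text (FIG. 6A numbering).
FIRST_GLYCINE = 485        # first glycine of the N-terminal glycine-rich region
CONSERVED_ARGININE = 744   # C-terminal conserved arginine


def construct_boundaries(upstream, downstream, tolerance=0):
    """Hypothetical helper: return ((start_min, start_max), (end_min, end_max)).

    `upstream` residues are counted back from the first glycine and
    `downstream` residues forward from the conserved arginine; `tolerance`
    widens each nominal boundary symmetrically.
    """
    start = FIRST_GLYCINE - upstream
    end = CONSERVED_ARGININE + downstream
    return (start - tolerance, start + tolerance), (end - tolerance, end + tolerance)


# Span generally believed to be needed for catalytic activity.
catalytic = construct_boundaries(upstream=7, downstream=10)
# Span the text describes as potentially needed for crystallization.
crystallization = construct_boundaries(upstream=20, downstream=17, tolerance=5)

print("catalytic:", catalytic)
print("crystallization:", crystallization)
```

Under these assumptions the crystallization construct starts roughly 13 positions earlier and ends roughly 7 positions later than the span associated with catalytic activity.
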
In those situations where the resulting polypeptide 
contains cysteine residues that interfere with 
crystallization (e.g., cysteine residue numbers 488 and 
584 in the FGF-R1 sequence shown in FIG. 6A), such 
cysteine residues can be substituted with an appropriate 
amino acid that does not readily form covalent bonds 
with other amino acid residues under crystallization 
conditions; e.g., by substituting the cysteine with Ala, 
Ser or Gly. Any cysteine located in a non-helical or 
non-β-stranded segment, based on secondary structure 
assignments, is a good candidate for replacement. For 
example, cysteines located in regions corresponding to 
the glycine-rich loop, the kinase insert, the 
juxtamembrane region or the activation loop are prime 
candidates for replacement. However, substitutions of 
cysteine residues that are conserved among the kinases 
(e.g., FIG. 6A at positions 725 and 736) are preferably 
avoided. 
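
The substitution rule in the preceding paragraph can be sketched as a small check. This is a minimal illustration under stated assumptions: the function, the toy sequence and its numbering are hypothetical, and the sequence is assumed to be numbered contiguously so that a residue number maps directly to a string index.

```python
CONSERVED_CYSTEINES = {725, 736}        # positions the text prefers to leave unchanged
ALLOWED_REPLACEMENTS = {"A", "S", "G"}  # Ala, Ser or Gly (one-letter codes)


def substitute_cysteines(sequence, first_number, positions, replacement="A"):
    """Hypothetical helper: replace cysteines at the given residue numbers.

    `first_number` is the residue number of sequence[0]. A ValueError is
    raised for conserved positions, non-cysteine positions, positions outside
    the sequence, or a replacement other than Ala, Ser or Gly.
    """
    if replacement not in ALLOWED_REPLACEMENTS:
        raise ValueError("replacement must be Ala, Ser or Gly")
    residues = list(sequence)
    for number in positions:
        if number in CONSERVED_CYSTEINES:
            raise ValueError(f"cysteine {number} is conserved among the kinases")
        index = number - first_number
        if not 0 <= index < len(residues):
            raise ValueError(f"residue {number} is outside the sequence")
        if residues[index] != "C":
            raise ValueError(f"residue {number} is not a cysteine")
        residues[index] = replacement
    return "".join(residues)


# Toy fragment (not a real kinase sequence) whose first residue is numbered 487.
toy_fragment = "GCKA"
mutated = substitute_cysteines(toy_fragment, 487, [488], "S")
print(mutated)
```
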

I. PTK Associated Diseases 

Blood vessel proliferative disorders refer to 
angiogenic and vasculogenic disorders generally 
resulting in abnormal proliferation of blood vessels. 
The formation and spreading of blood vessels play 
important roles in a variety of physiological processes 
such as embryonic development, corpus luteum formation, 
wound healing and organ regeneration. They also play a 
pivotal role in cancer development. Other examples of 
blood vessel proliferation disorders include arthritis, 
where new capillary blood vessels invade the joint and 
destroy cartilage, and ocular diseases, like diabetic 
retinopathy, where new capillaries in the retina invade 
the vitreous, bleed and cause blindness. Conversely, 
disorders related to the shrinkage, contraction or 
closing of blood vessels are implicated in such diseases 
as restenosis. 

Fibrotic disorders refer to the abnormal formation 
of extracellular matrix. Examples of fibrotic disorders 
include hepatic cirrhosis and mesangial cell 
proliferative disorders. Hepatic cirrhosis is 
characterized by the increase in extracellular matrix 
constituents resulting in the formation of a hepatic 
scar. Hepatic cirrhosis can cause diseases such as 
cirrhosis of the liver. An increased extracellular 
matrix resulting in a hepatic scar can also be caused by 
viral infection such as hepatitis. 

Mesangial cell proliferative disorders refer to 
disorders brought about by abnormal proliferation of 
mesangial cells. Mesangial proliferative disorders 
include various human renal diseases, such as 
glomerulonephritis, diabetic nephropathy, malignant 
nephrosclerosis, thrombotic microangiopathy syndromes, 
transplant rejection, and glomerulopathies. The PDGF-R 
has been implicated in the maintenance of mesangial cell 
proliferation. Floege et al., 1993, Kidney 
International 43:47S-54S. 

PTKs are directly associated with the cell 
proliferative disorders described above. For example, 
some members of the receptor PTK family have been 
associated with the development of cancer. Some of 
these receptors, like EGFR (Tuzi et al., 1991, Br. J. 
Cancer 63:227-233; Torp et al., 1992, APMIS 
100:713-719), HER2/neu (Slamon et al., 1989, Science 
244:707-712) and PDGF-R (Kumabe et al., 1992, Oncogene 
7:627-633) are over-expressed in many tumors and/or 
persistently activated by autocrine loops. In fact, PTK 
over-expression (Akbasak and Suner-Akbasak et al., 1992, 
J. Neurol. Sci. 111:119-133; Dickson et al., 1992, Cancer 
Treatment Res. 51:249-273; Korc et al., 1992, J. Clin. 
Invest. 90:1352-1360) and autocrine loop stimulation 
(Lee and Donoghue, 1992, J. Cell. Biol. 110:1057-1070; 
Korc et al., supra; Akbasak and Suner-Akbasak et al., 
supra) account for the most common and severe cancers. 
For example, EGFR is associated with squamous cell 
carcinoma, astrocytoma, glioblastoma, head and neck 
cancer, lung cancer and bladder cancer. HER2 is 
associated with breast, ovarian, gastric, lung, pancreas 
and bladder cancer. PDGF-R is associated with 
glioblastoma, lung, ovarian, melanoma and prostate 
cancer. The receptor PTK c-met is generally associated 
with hepatocarcinogenesis and thus hepatocellular 
carcinoma. Additionally, c-met is linked to malignant 
tumor formation. More specifically, c-met has been 
associated with, among other cancers, colorectal, 
thyroid, pancreatic and gastric carcinoma, leukemia and 
lymphoma. Additionally, over-expression of the c-met 
gene has been detected in patients with Hodgkin's 
disease, Burkitt's disease, and the lymphoma cell line. 

The IGF-I receptor PTK, in addition to being 
implicated in nutritional support and in type-II 
diabetes, is also associated with several types of 
cancers. For example, IGF-I has been implicated as an 
autocrine growth stimulator for several tumor types, 
e.g., human breast cancer carcinoma cells (Arteaga et 
al., 1989, J. Clin. Invest. 54:1418-1423) and small lung 
tumor cells (Macauley et al., 1990, Cancer Res. 
50:2511-2517). In addition, IGF-I, integrally involved in 
the normal growth and differentiation of the nervous system, 
appears to be an autocrine stimulator of human gliomas. 
Sandberg-Nordqvist et al., 1993, Cancer Res. 
53:2475-2478. The importance of the IGF-IR and its modulators 
in cell proliferation is further supported by the fact 
that many cell types in culture (fibroblasts, epithelial 
cells, smooth muscle cells, T-lymphocytes, myeloid 
cells, chondrocytes, osteoblasts, the stem cells of the 
bone marrow) are stimulated to grow by IGF-I. Goldring 
and Goldring, 1991, Eukaryotic Gene Expression 
1:301-326. A series of recent publications suggests that 
IGF-IR plays a central role in the mechanisms of 
transformation and, as such, could be a preferred target 
for therapeutic interventions for a broad spectrum of 
human malignancies. Baserga, 1995, Cancer Res. 
55:249-252; Baserga, 1994, Cell 79:927-930; Coppola et al., 
1994, Mol. Cell. Biol. 14:4588-4595. 

The association between abnormalities in receptor 
PTKs and disease is not restricted to cancer, however. 
For example, receptor PTKs are associated with metabolic 
diseases like psoriasis, diabetes mellitus, wound 
healing, inflammation, and neurodegenerative diseases. 
EGF-R is indicated in corneal and dermal wound healing. 
Defects in Insulin-R and IGF-IR are indicated in type-II 
diabetes mellitus. A more complete correlation between 
specific receptor PTKs and their therapeutic indications 
is set forth in Plowman et al., 1994, DN&P 7:334-339. 

Non-receptor PTKs, including src, abl, fps, yes, 
fyn, lyn, lck, blk, hck, fgr, yrk (reviewed by Bolen et 
al., 1992, FASEB J. 6:3403-3409), are involved in the 
proliferative and metabolic signal transduction pathways 
also associated with receptor PTKs. Therefore, the 
present invention is also directed towards designing 
modulators against this class of PTKs. For example, 
mutated src (v-src) is an oncoprotein (pp60v-src) in 
chicken. Moreover, its cellular homolog, the 
proto-oncogene pp60c-src, transmits oncogenic signals of many 
receptors. For example, over-expression of EGF-R or 
HER2/neu in tumors leads to the constitutive activation 
of pp60c-src, which is characteristic of the malignant 
cell but absent in the normal cell. On the other hand, 
mice deficient for the expression of c-src exhibit an 
osteopetrotic phenotype, indicating a key participation 
of c-src in osteoclast function and a possible 
involvement in related disorders. Similarly, Zap 70 is 
implicated in T-cell signaling. Both receptor PTKs and 
non-receptor PTKs are connected to hyperimmune 
disorders. 

The instant invention is directed in part towards 
designing modulators of PTK function that could 
indirectly kill tumors by cutting off their source of 
sustenance. Normal vasculogenesis and angiogenesis play 
important roles in a variety of physiological processes 
such as embryonic development, wound healing, organ 
regeneration and female reproductive processes such as 
follicle development in the corpus luteum during 
ovulation and placental growth after pregnancy. Folkman 
and Shing, 1992, J. Biological Chem. 267:10931-34. 
However, many diseases are driven by persistent 
unregulated or inappropriate angiogenesis. For example, 
in arthritis, new capillary blood vessels invade the 
joint and destroy the cartilage. In diabetes, new 
capillaries in the retina invade the vitreous, bleed and 
cause blindness. Folkman, 1987, in: Congress of 
Thrombosis and Haemostasis (Verstraete, et al., eds.), 
Leuven University Press, Leuven, pp. 583-596. Ocular 
neovascularization is the most common cause of blindness 
and dominates approximately twenty (20) eye diseases. 

Moreover, vasculogenesis and/or angiogenesis can be 
associated with the growth of malignant solid tumors and 
metastasis. A tumor must continuously stimulate the 
growth of new capillary blood vessels for the tumor 
itself to grow. Furthermore, the new blood vessels 
embedded in a tumor provide a gateway for tumor cells to 
enter the circulation and to metastasize to distant 
sites in the body. Folkman, 1990, J. Natl. Cancer Inst. 
82:1-6; Klagsbrun and Soker, 1993, Current Biology 
3:699-702; Folkman, 1991, J. Natl. Cancer Inst. 82:4-6; 
Weidner et al., 1991, New Engl. J. Med. 324:1-5. 

Several polypeptides with in vitro endothelial cell 
growth promoting activity have been identified. 
Examples include acidic and basic fibroblastic growth 
factor (aFGF, bFGF), vascular endothelial growth factor 
(VEGF) and placental growth factor. Unlike aFGF and 
bFGF, VEGF has recently been reported to be an 
endothelial cell specific mitogen. Ferrara and Henzel, 
1989, Biochem. Biophys. Res. Comm. 151:851-858; Vaisman 
et al., 1990, J. Biol. Chem. 265:19461-19566. 

Thus, identifying the specific receptors that bind 
FGF or VEGF is important for understanding endothelial 
cell proliferation regulation. Two structurally related 
receptor PTKs that bind VEGF with high affinity are 
identified: the flt-1 receptor (Shibuya et al., 1990, 
Oncogene 5:519-524; De Vries et al., 1992, Science 
255:989-991) and the KDR/FLK-1 receptor, discussed in 
the U.S. Patent Application No. 08/193,829. In 
addition, a receptor that binds aFGF and bFGF is 
identified. Jaye et al., 1992, Biochem. Biophys. Acta 
1135:185-199. Consequently, these receptor PTKs most 
likely regulate endothelial cell proliferation. 

FGFRs play important roles in angiogenesis, wound 
healing, embryonic development, and malignant 
transformation. Basilico and Moscatelli, 1992, Adv. 
Cancer Res. 59:115-165. Four mammalian FGFRs (FGFR1-4) 
have been described and additional diversity is 
generated by alternative RNA splicing within the 
extracellular domains. Jaye et al., 1992, Biochem. 
Biophys. Acta 1135:185-199. Like other receptor PTKs, 
dimerization of FGF receptors is essential for their 
activation. Soluble or cell surface-bound heparin 
sulfate proteoglycans act in concert with FGF to induce 
dimerization (Schlessinger et al., 1995, Cell 
83:357-360), which leads to autophosphorylation of specific 
tyrosine residues in the cytoplasmic domain. Mohammadi 
et al., 1996, Mol. Cell Biol. 15:977-989. 

Mutations in three human FGF receptor genes, FGFR1, 
FGFR2, and FGFR3, have been implicated in a variety of 
human genetic skeletal disorders. Mutations in FGFR1 
and FGFR2 result in the premature fusion of the flat 
bones of the skull and cause the craniosynostosis 
syndromes, such as Apert (FGFR2) (Wilkie et al., 1994, 
Nat. Genet. 6:269-274), Pfeiffer (FGFR1 and FGFR2) 
(Muenke et al., 1994, Nat. Genet. 8:269-274), 
Jackson-Weiss (FGFR2) (Jabs et al., 1994, Nat. Genet. 
8:275-279) and Crouzon (FGFR2) (Jabs et al., 1994, Nat. 
Genet. 8:275-279) syndromes. In contrast, mutations in 
FGFR3 are implicated in long bone disorders and cause 
several clinically related forms of dwarfism including 
achondroplasia (Shiang et al., 1994, Cell 75:335-342), 
hypochondroplasia (Bellus et al., 1995, Nat. Genet. 
10:357-359) and the neonatal lethal thanatophoric 
dysplasia (Tavormina et al., 1995, Nat. Genet. 
9:321-328). It has been shown that these mutations lead to 
constitutive activation of the tyrosine kinase activity 
of FGFR3 (Webster et al., 1996, EMBO J. 15:520-527). 
Furthermore, gene-targeting experiments in mice have 
revealed an essential role for FGFR3 in developmental 
bone formation (Deng et al., 1996, Cell 54:911-921). 

Another major role proposed for FGFs in vivo is the 
induction of angiogenesis (Folkman and Klagsbrun, 1987, 
Science 236:442). Therefore, inappropriate expression 
of FGFs or of their receptors or aberrant function of 
the tyrosine kinase activity could contribute to several 
human angiogenic pathologies such as diabetic 
retinopathy, rheumatoid arthritis, atherosclerosis and 
tumor neovascularization (Klagsbrun and Edelman, 1989, 
Arteriosclerosis 9:269). Moreover, FGFs are thought to 
be involved in malignant transformation. Indeed, 
genes coding for the three FGF homologues int-2, FGF-5 
and hst-1/K-fgf were originally isolated as oncogenes. 
Furthermore, the cDNAs encoding FGFR1 and FGFR2 are 
amplified in a population of breast cancers (Adnane et 
al., 1991, Oncogene 6:659-663). Over-expression of FGF 
receptors has also been detected in human pancreatic 
cancers, astrocytomas, salivary gland adenosarcomas, 
Kaposi sarcomas, ovarian cancers and prostate cancers. 

Evidence, such as the disclosure set forth in 
copending U.S. Application Serial No. 08/193,829, 
strongly suggests that VEGF is not only responsible for 
endothelial cell proliferation, but also is a prime 
regulator of normal and pathological angiogenesis. See 
generally, Klagsbrun and Soker, 1993, Current Biology 
3:699-702; Houck et al., 1992, J. Biol. Chem. 
267:26031-26037. Moreover, it has been shown that 
KDR/FLK-1 and flt-1 are abundantly expressed in the 
proliferating endothelial cells of a growing tumor, but 
not in the surrounding quiescent endothelial cells. 
Plate et al., 1992, Nature 355:845-848; Shweiki et al., 
1992, Nature 359:843-845. 

The invention is directed to designing and 
identifying modulators of receptor and non-receptor PTK 
functions that could modify the inappropriate activity 
of a PTK involved with a clinical disorder. The 
rational design and identification of modulators of PTK 
functions can be accomplished by utilizing the 
structural coordinates that define a PTK three 
dimensional structure. 

II. Modulators of PTK Functions as Therapeutics for 
Disease 

As a consequence of the disorders discussed above, 
scientists in the biomedical community are searching for 
modulators of PTK functions that down-regulate signal 
transduction pathways associated with inappropriate PTK 
activity. 

In particular, small molecule modulators of PTK 
functions are sought as some can traverse the cell 
membrane and do not hydrolyze in acidic environments. 
Some compounds have already been discovered. For 
example, bis monocyclic, bicyclic or heterocyclic aryl 
compounds (PCT WO 92/20642), vinylene-azaindole 
derivatives (PCT WO 94/14808), 
1-cyclopropyl-4-pyridyl-quinolones (U.S. Patent No. 5,330,992), styryl compounds 
(U.S. Patent No. 5,217,999), styryl-substituted pyridyl 
compounds (U.S. Patent No. 5,302,606), certain 
quinazoline derivatives (EP Application No. 0 566 266 
A1), seleoindoles and selenides (PCT WO 94/03427), 
tricyclic polyhydroxylic compounds (PCT WO 92/21660), 
and benzylphosphonic acid compounds (PCT WO 91/15495) 
are described as PTK inhibitors. 

Although some modulators of PTK function are known, 
many of these are not specific for PTK subfamilies and 
will therefore cause multiple side-effects as 
therapeutics. Compounds of the 
oxindolinone/thiolindolinone family, however, are specific for the 
FGF receptor subfamily (U.S. Patent Application Serial 
No. 08/702,232, filed August 23, 1996, invented by Tang 
et al., entitled "Indolinone Combinatorial Libraries and 
Related Products and Methods for the Treatment of 
Disease," Attorney Docket No. 221/167). In addition, 
compounds of the oxindolinone/thiolindolinone family are 
non-hydrolyzable in acidic conditions and can be highly 
bioavailable. 

The invention provides information regarding the 
specific interactions between a PTK and compounds of the 
oxindolinone/thiolindolinone family. Although the use 
of X-ray crystallography has provided three dimensional 
structures of other PTKs, the PTKs in these structures 
are not complexed with PTK subfamily specific, 
hydrolysis resistant, highly bioavailable small 
molecules. The X-ray crystallography techniques used in 
the current invention resolve interactions between a PTK 
and compounds in complex with it at the atomic level, 
which provides detailed information regarding the 
orientation of chemical groups defining an effective 
modulator of PTK function. 

III. Crystalline Tyrosine Kinases 

Crystalline PTKs of the invention include native 
crystals, derivative crystals and co-crystals. The 
native crystals of the invention generally comprise 
substantially pure polypeptides corresponding to the 
tyrosine kinase domain in crystalline form. 

It is to be understood that the crystalline 
tyrosine kinase domains of the invention are not limited 
to naturally occurring or native tyrosine kinase 
domains. Indeed, the crystals of the invention include 
mutants of native tyrosine kinase domains. Mutants of 
native tyrosine kinase domains are obtained by replacing 
at least one amino acid residue in a native tyrosine 
kinase domain with a different amino acid residue, or by 
adding or deleting amino acid residues within the native 
polypeptide or at the N- or C-terminus of the native 
polypeptide, and have substantially the same 
three-dimensional structure as the native tyrosine kinase 
domain from which the mutant is derived. 

By having substantially the same three-dimensional 
structure is meant having a set of atomic structure 
coordinates that have a root-mean-square deviation of 
less than or equal to about 2 Å when superimposed with 
the atomic structure coordinates of the native tyrosine 
kinase domain from which the mutant is derived when at 
least about 50% to 100% of the Cα atoms of the native 
tyrosine kinase domain are included in the 
superposition. 
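
The criterion in the preceding paragraph can be written out as a short calculation. The sketch below is illustrative only: the function names are hypothetical, the two coordinate lists are assumed to hold matched Cα positions (in Å) that have already been superimposed by some fitting procedure not shown here, and the 2 Å and 50% thresholds are taken from the text.

```python
import math


def ca_rmsd(native_ca, mutant_ca):
    """Root-mean-square deviation between matched, pre-superimposed Cα atoms."""
    if not native_ca or len(native_ca) != len(mutant_ca):
        raise ValueError("need two equally sized, non-empty coordinate lists")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(native_ca, mutant_ca)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(native_ca))


def substantially_same(native_ca, mutant_ca, total_native_ca,
                       cutoff=2.0, min_fraction=0.5):
    """Apply the 'about 2 Å over at least about 50% of Cα atoms' test."""
    fraction = len(native_ca) / total_native_ca
    return fraction >= min_fraction and ca_rmsd(native_ca, mutant_ca) <= cutoff


# Toy coordinates (hypothetical): every atom is displaced by 1 Å along z.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
mutant = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(ca_rmsd(native, mutant))
print(substantially_same(native, mutant, total_native_ca=4))
```

A real comparison would first compute the optimal superposition; that step is deliberately left out of this sketch.
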

Amino acid substitutions, deletions and additions 
which do not significantly interfere with the three- 
dimensional structure of the tyrosine kinase domain will 
depend, in part, on the region of the tyrosine kinase 
domain where the substitution, addition or deletion 
occurs. In highly variable regions of the molecule, 
such as those shown in FIG. 6, non-conservative 
substitutions as well as conservative substitutions may 
be tolerated without significantly disrupting the 
three-dimensional structure of the molecule. In highly 
conserved regions, or regions containing significant 
secondary structure, such as those regions shown in FIG. 
6, conservative amino acid substitutions are preferred. 

Conservative amino acid substitutions are well 
known in the art, and include substitutions made on the 
basis of similarity in polarity, charge, solubility, 
hydrophobicity, hydrophilicity and/or the amphipathic 
nature of the amino acid residues involved. For 
example, negatively charged amino acids include aspartic 
acid and glutamic acid; positively charged amino acids 
include lysine and arginine; amino acids with uncharged 
polar head groups having similar hydrophilicity values 
include the following: leucine, isoleucine, valine; 
glycine, alanine; asparagine, glutamine; serine, 
threonine; phenylalanine, tyrosine. Other conservative 
amino acid substitutions are well known in the art. 
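
The groupings listed above can be captured in a simple lookup. The sketch below is only an illustration: the table contains just the groups named in the text (written with three-letter codes), the function name is hypothetical, and a substitution is treated as conservative when both residues fall in the same listed group.

```python
# Groups of residues named in the text as conservative replacements for one another.
CONSERVATIVE_GROUPS = [
    {"Asp", "Glu"},          # negatively charged
    {"Lys", "Arg"},          # positively charged
    {"Leu", "Ile", "Val"},
    {"Gly", "Ala"},
    {"Asn", "Gln"},
    {"Ser", "Thr"},
    {"Phe", "Tyr"},
]


def is_conservative(original, replacement):
    """Return True if both residues belong to one of the listed groups."""
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative("Asp", "Glu"))  # same group
print(is_conservative("Gly", "Lys"))  # different groups
```

Residues not named in the text (for example Cys or Trp) are simply absent from the table, so the helper reports them as non-conservative rather than guessing a group for them.
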

For tyrosine kinase domains obtained in whole or in 
part by chemical synthesis, the selection of amino acids 
available for substitution or addition is not limited to 
the genetically encoded amino acids. Indeed, the 
mutants described herein may contain non-genetically 
encoded amino acids. Conservative amino acid 
substitutions for many of the commonly known 
non-genetically encoded amino acids are well known in the 
art. Conservative substitutions for other amino acids 
can be determined based on their physical properties as 
compared to the properties of the genetically encoded 
amino acids. 

In some instances, it may be particularly 
advantageous or convenient to substitute, delete and/or 
add amino acid residues to a native tyrosine kinase 
domain in order to provide convenient cloning sites in 
cDNA encoding the polypeptide, to aid in purification of 
the polypeptide, and for crystallization of the 
polypeptide. Such substitutions, deletions and/or 
additions which do not substantially alter the three 
dimensional structure of the native tyrosine kinase 
domain will be apparent to those of ordinary skill in 
the art. 

It should be noted that the mutants contemplated 
herein need not exhibit PTK activity. Indeed, amino 
acid substitutions, additions or deletions that 
interfere with the kinase activity of the tyrosine 
kinase domain but which do not significantly alter the 
three-dimensional structure of the domain are 
specifically contemplated by the invention. Such 
crystalline polypeptides, or the atomic structure 
coordinates obtained therefrom, can be used to identify 
compounds that bind to the native domain. These 
compounds may affect the activity of the native domain. 

The derivative crystals of the invention generally 
comprise a crystalline tyrosine kinase domain 
polypeptide in covalent association with one or more 
heavy metal atoms. The polypeptide may correspond to a 
native or a mutated tyrosine kinase domain. Heavy metal 
atoms useful for providing derivative crystals include, 
by way of example and not limitation, gold, mercury, 
etc. 

The co-crystals of the invention generally comprise 
a crystalline tyrosine kinase domain polypeptide in 
association with one or more compounds. The association 
may be covalent or non-covalent. Such compounds 
include, but are not limited to, cofactors, substrates, 
substrate analogues, inhibitors, allosteric effectors, 
etc. 

IV. Three Dimensional Structure Determination Using 
X-ray Crystallography 

X-ray crystallography is a method of solving the 
three dimensional structures of molecules. The 
structure of a molecule is calculated from X-ray 
diffraction patterns using a crystal as a diffraction 
grating. Three dimensional structures of protein 
molecules arise from crystals grown from a concentrated 
aqueous solution of that protein. The process of X-ray 
crystallography can include the following steps: 

(a) synthesizing and isolating a polypeptide; 

(b) growing a crystal from an aqueous solution 
comprising the polypeptide with or without a 
modulator; and 

(c) collecting X-ray diffraction patterns from the 
crystals, determining unit cell dimensions and 
symmetry, determining electron density, 
fitting the amino acid sequence of the 
polypeptide to the electron density, and 
refining the structure. 

Production of Polypeptides 

The native and mutated tyrosine kinase domain 
polypeptides described herein may be chemically 
synthesized in whole or part using techniques that are 
well-known in the art (e.g., Creighton, 1983). 

Alternatively, methods which are well known to those 
skilled in the art can be used to construct expression 
vectors containing the native or mutated tyrosine kinase 
domain polypeptide coding sequence and appropriate 
transcriptional/translational control signals. These 
methods include in vitro recombinant DNA techniques, 
synthetic techniques and in vivo recombination/genetic 
recombination. See, for example, the techniques 
described in Maniatis et al., 1989 and Ausubel et al., 
1989. 

A variety of host-expression vector systems may be 
utilized to express the tyrosine kinase domain coding 
sequence. These include but are not limited to 
microorganisms such as bacteria transformed with 
recombinant bacteriophage DNA, plasmid DNA or cosmid DNA 
expression vectors containing the tyrosine kinase domain 
coding sequence; yeast transformed with recombinant 
yeast expression vectors containing the tyrosine kinase 
domain coding sequence; insect cell systems infected 
with recombinant virus expression vectors (e.g., 
baculovirus) containing the tyrosine kinase domain 
coding sequence; plant cell systems infected with 
recombinant virus expression vectors (e.g., cauliflower 
mosaic virus, CaMV; tobacco mosaic virus, TMV) or 
transformed with recombinant plasmid expression vectors 
(e.g., Ti plasmid) containing the tyrosine kinase domain 
coding sequence; or animal cell systems. The expression 
elements of these systems vary in their strength and 
specificities. 

Depending on the host/vector system utilized, any 
of a number of suitable transcription and translation 
elements, including constitutive and inducible 
promoters, may be used in the expression vector. For 
example, when cloning in bacterial systems, inducible 
promoters such as pL of bacteriophage λ, plac, ptrp, 
ptac (ptrp-lac hybrid promoter) and the like may be 
used; when cloning in insect cell systems, promoters 
such as the baculovirus polyhedrin promoter may be used; 
when cloning in plant cell systems, promoters derived 
from the genome of plant cells (e.g., heat shock 
promoters; the promoter for the small subunit of 
RUBISCO; the promoter for the chlorophyll a/b binding 
protein) or from plant viruses (e.g., the 35S RNA 
promoter of CaMV; the coat protein promoter of TMV) may 
be used; when cloning in mammalian cell systems, 
promoters derived from the genome of mammalian cells 
(e.g., metallothionein promoter) or from mammalian 
viruses (e.g., the adenovirus late promoter; the 
vaccinia virus 7.5K promoter) may be used; when 
generating cell lines that contain multiple copies of 
the tyrosine kinase domain DNA, SV40-, BPV- and 
EBV-based vectors may be used with an appropriate selectable 
marker. 

Methods of DNA manipulation, 
vectors, various types of cells used, methods of 
incorporating the vectors into the cells, expression 
techniques, protein purification and isolation methods, 
and protein concentration methods are disclosed in 
detail with respect to the protein PYK-2 in PCT 
publication WO 96/18738. This publication is 
incorporated herein by reference in its entirety, 
including any drawings. Those skilled in the art will 
appreciate that such descriptions are applicable to the 
present invention and can be easily adapted to it. 

Crystals are grown from an aqueous solution 
containing the purified and concentrated polypeptide by 
a variety of techniques. These techniques include 
batch, liquid bridge, dialysis, vapor diffusion, and 
hanging drop methods. McPherson, 1982, John Wiley, New 
York; McPherson, 1990, Eur. J. Biochem. 189:1-23; 
Webber, 1991, Adv. Protein Chem. 41:1-36, incorporated 
by reference herein in their entirety, including all 
figures, tables, and drawings. 

Generally, the native crystals of the invention are 
grown by adding precipitants to the concentrated 
solution of the polypeptide corresponding to the PTK 
catalytic domain. The precipitants are added at a
concentration just below that necessary to precipitate
the protein. Water is removed by controlled evaporation
to produce precipitating conditions, which are
maintained until crystal growth ceases.

For crystals of the invention, it has been found
that hanging drops containing about 2.0 µL of tyrosine
kinase domain polypeptide (10 mg/mL in 10 mM Tris-HCl, pH
8.0, 10 mM NaCl and 2 mM dithiothreitol) and 2.0 µL
reservoir solution (16% w/v polyethylene glycol MW
10000, 0.3 M (NH₄)₂SO₄, 5% v/v ethylene glycol or
glycerol and 100 mM bis-Tris, pH 6.5) suspended over 0.5
mL reservoir buffer for about 3-4 weeks at 4°C provide
crystals suitable for high resolution X-ray structure
determination.
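By way of illustration only, the starting composition of such a hanging drop can be estimated by simple dilution arithmetic. The sketch below assumes that the two "2.0" volumes are equal volumes of polypeptide solution and reservoir solution, and that components absent from one solution contribute nothing from that side; the function name `mix_drop` is hypothetical. Subsequent vapor diffusion against the reservoir, which concentrates the drop, is not modeled.

```python
def mix_drop(protein_solution, reservoir_solution, v_protein, v_reservoir):
    """Return component concentrations right after mixing two solutions.

    Each solution is a dict mapping component name -> concentration.
    A component absent from one solution contributes zero from that side.
    """
    total = v_protein + v_reservoir
    names = set(protein_solution) | set(reservoir_solution)
    return {
        name: (protein_solution.get(name, 0.0) * v_protein
               + reservoir_solution.get(name, 0.0) * v_reservoir) / total
        for name in names
    }

# Concentrations quoted in the text above.
protein = {"polypeptide (mg/mL)": 10.0, "Tris-HCl (mM)": 10.0,
           "NaCl (mM)": 10.0, "dithiothreitol (mM)": 2.0}
reservoir = {"PEG 10000 (% w/v)": 16.0, "ammonium sulfate (M)": 0.3,
             "ethylene glycol or glycerol (% v/v)": 5.0, "bis-Tris (mM)": 100.0}

# Equal volumes (2.0 and 2.0); the volume unit cancels out of the ratio.
drop = mix_drop(protein, reservoir, 2.0, 2.0)
for name in sorted(drop):
    print(f"{name}: {drop[name]:g}")
```

With equal volumes every component starts at half its stock concentration, which is consistent with the precipitant beginning below the level needed to precipitate the protein.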

Those of ordinary skill in the art will recognize
that the above-described crystallization conditions can
be varied. Such variations may be used alone or in
combination, and include polypeptide solutions
containing polypeptide concentrations between about 1
mg/mL and about 60 mg/mL, Tris-HCl concentrations
between about 10 mM and about 200 mM, dithiothreitol
concentrations between about 0 mM and about 20 mM, pH
ranges between about 5.5 and about 7.5; and reservoir
solutions containing polyethylene glycol concentrations
between about 10% and about 30% (w/v), polyethylene
glycol molecular weights between about 1000 and about
20,000, (NH₄)₂SO₄ concentrations between about 0.1 M and
about 0.5 M, ethylene glycol or glycerol concentrations
between about 0% and about 20% (v/v), bis-Tris
concentrations between about 10 mM and about 200 mM, pH
ranges between about 5.5 and about 7.5 and temperature
ranges between about 0°C and about 25°C. Other buffer
solutions may be used such as HEPES buffer, so long as
the desired pH range is maintained.

Derivative crystals of the invention can be 
obtained by soaking native crystals in mother liquor 
containing salts of heavy metal atoms. It has been 
found that soaking a native crystal in a solution 
containing about 0.1 mM to about 5 mM thimerosal, 4-
chloromercuribenzoic acid or KAu(CN)₂ for about 2 hr to
about 72 hr provides derivative crystals suitable for
use as isomorphous replacements in determining the X-ray
crystal structure of the tyrosine kinase domain
polypeptide.

Co-crystals of the invention can be obtained by
soaking a native crystal in mother liquor containing
a compound that binds the kinase domain, as described
above, or can be obtained by co-crystallizing the kinase
domain polypeptide in the presence of one or more
binding compounds.

For co-crystals of tyrosine kinase domain
polypeptide in co-complex with AMP-PCP, it has been
found that co-crystallizing the kinase domain
polypeptide in the presence of AMP-PCP using the above-
described crystallization conditions for obtaining
native crystals with a polypeptide solution additionally
containing 10 mM AMP-PCP and 20 mM MgCl₂ yields
co-crystals suitable for high resolution structure
determination by X-ray crystallography. Of course,
those having skill in the art will recognize that the
concentrations of AMP-PCP and MgCl₂ in the polypeptide
solution can be varied, alone or in combination with the
variations described above for native crystals. Such
variations include polypeptide solutions containing
AMP-PCP concentrations between 0.1 mM and 50 mM and MgCl₂
concentrations between 0 mM and 50 mM.

Crystals comprising a polypeptide corresponding to
a PTK catalytic domain complexed with a compound can be
grown by one of two methods. In the first method, the
modulator is added to the aqueous solution containing
the polypeptide corresponding to the PTK catalytic
domain before the crystal is grown. In the second
method, the modulator is soaked into an already existing
crystal of a polypeptide corresponding to a PTK
catalytic domain.

Crystalline FGFR1

In one illustrative embodiment, the invention 
provides crystals of FGFR1. The crystals were obtained
by the methods provided in the Examples. The FGFR1
crystals, which may be native crystals, derivative
crystals or co-crystals, have monoclinic unit cells
(i.e., unit cells wherein a≠b≠c; α=γ=90°; and β>90°) and
space group symmetry C2. There are two FGFR1 molecules
in the asymmetric unit, related by an approximate two-
fold axis.

Two forms of crystalline FGFR1 were obtained. In 
one form (designated "C2-A form"), the unit cell has 
dimensions of a=208.3 Å, b=57.2 Å, c=65.5 Å and
β=107.2°. In another form (designated "C2-B form"), the
unit cell has dimensions of a=211.6 Å, b=51.3 Å, c=66.1
Å and β=107.7°.
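For orientation, the volume of a monoclinic unit cell follows from its dimensions as V = a·b·c·sin β. The sketch below applies this standard geometric relation to the two cell forms quoted above; the function name is hypothetical, and the figures it prints are derived here rather than reported in the specification.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic unit cell (alpha = gamma = 90 degrees)."""
    return a * b * c * math.sin(math.radians(beta_deg))

# Cell dimensions (Angstroms, degrees) as quoted for the two crystal forms.
v_a = monoclinic_cell_volume(208.3, 57.2, 65.5, 107.2)
v_b = monoclinic_cell_volume(211.6, 51.3, 66.1, 107.7)
print(f"C2-A: {v_a:.0f} cubic Angstroms")
print(f"C2-B: {v_b:.0f} cubic Angstroms")
```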

Three distinct two-fold related FGFR1 dimers are 
observed in both the C2-A and C2-B forms of the FGFR1 
crystal, one non-crystallographically related dimer and 
two crystallographically related dimers. The non-
crystallographically related dimer comprises the two
molecules in the asymmetric unit. The residues making
up the dimer interface are located in the C-terminal lobe.
In this dimer, the C-terminal lobes abut with the N-
terminal lobes distal to one another. The total amount
of surface area buried in the interface is about 950 Å².
Very few of the interactions in the interface are of a
specific nature, e.g., hydrogen-bonding or close packing
of hydrophobic residues.

There are two crystallographically-related dimers
in the C2 lattice. In the first dimer, the residues
that constitute the dimer interface are limited to those
in the β-sheet of the N-terminal lobe (amino acid
residues 477, 479, 498, 506, 508 and 496). The total
surface area buried in this interface is about 670 Å².
The interactions are rather specific. Three hydrophobic
residues which are partially solvent-exposed in the
monomer, Val-479, Ile-498 and Val-508, come together
with their two-fold-related residues to form a compact
hydrophobic plug. This plug is capped on either side by
a salt bridge between Arg-477 and Glu-496. In addition,
two main-chain hydrogen-bonds connect the β-sheets of
the two monomers at the start of β3 (amino acid residues
506 and 508). The residues in this dimer interface, or
their residue character, are generally conserved in the
mammalian FGF receptors, but not in the invertebrate
homologues.

The other crystallographically-related dimer buries
about 1650 Å² in its interface. In this dimer, the αC
helices of the two monomers are nearly parallel and
contact each other at their C-terminal ends. Met-534
and Met-537 are in van der Waals contact with their two-
fold-related residues. Other hydrophobic contacts
involve Pro-466 with Ile-648 and Pro-469 with Ile-676
and Thr-678. In addition, hydrogen bonds (side-chain to
main-chain) are made between Arg-470 and Lys-618 and
between His-649 and Glu-464, and there are several water
molecules that bridge the two monomers through hydrogen
bonding.

In the C2-B form of the crystal, the monomers of
this second crystallographically-related dimer are
shifted slightly with respect to one another (6°
rotation), indicating that this interface is somewhat
fluid.

In both of the crystallographically-related dimers,
the N-termini of the two molecules comprising the dimer
point in the same direction and are reasonably close to 
one another. 

Determining Unit Cell Dimensions and the Three 
Dimensional Structure of a Polypeptide or Polypeptide
Complex 

Once the crystal is grown, it can be placed in a 
glass capillary tube and mounted onto a holding device 
connected to an X-ray generator and an X-ray detection 
device. Collection of X-ray diffraction patterns is
well documented by those in the art. Ducruix and Geige,
1992, IRL Press, Oxford, England, and references cited
therein. A beam of X-rays enters the crystal and is then
diffracted by the crystal. An X-ray detection device
can be utilized to record the diffraction patterns 
emanating from the crystal. Although the X-ray 
detection device on older models of these instruments is 
a piece of film, modern instruments digitally record X- 
ray diffraction scattering. 

Methods for obtaining the three dimensional 
structure of the crystalline form of a peptide molecule 
or molecule complex are well known in the art. Ducruix 
and Geige, 1992, IRL Press, Oxford, England, and 
references cited therein. The following are steps in 
the process of determining the three dimensional 
structure of a molecule or complex from X-ray 
diffraction data.

After the X-ray diffraction patterns are collected
from the crystal, the unit cell dimensions and
orientation in the crystal can be determined. They can
be determined from the spacing between the diffraction
emissions as well as the patterns made from these
emissions. The unit cell dimensions are characterized
in three dimensions in units of Angstroms (one Å = 10⁻¹⁰
meters) and by the angles at each vertex. The symmetry of
the unit cell in the crystals is also characterized at
this stage. The symmetry of the unit cell in the
crystal simplifies the complexity of the collected data
by identifying repeating patterns. Application of the
symmetry and dimensions of the unit cell is described
below.
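The relation between the angular position of a diffraction emission and a repeat distance in the crystal is conventionally given by Bragg's law, nλ = 2d·sin θ. The sketch below is offered only as an illustration of that standard relation; the wavelength (copper Kα radiation) and the scattering angles are assumptions, since the specification does not state the X-ray source used, and the function name is hypothetical.

```python
import math

def bragg_spacing(wavelength, two_theta_deg, order=1):
    """Lattice-plane spacing d from Bragg's law: n * lambda = 2 * d * sin(theta)."""
    theta = math.radians(two_theta_deg / 2.0)
    return order * wavelength / (2.0 * math.sin(theta))

CU_K_ALPHA = 1.5418  # Angstroms; assumed X-ray source, not stated in the text

# Larger scattering angles correspond to finer spacings (higher resolution).
for two_theta in (5.0, 20.0, 45.0):
    d = bragg_spacing(CU_K_ALPHA, two_theta)
    print(f"2theta = {two_theta:5.1f} deg -> d = {d:.2f} Angstroms")
```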

Each diffraction pattern emission is characterized
as a vector and the data collected at this stage of the
method determines the amplitude of each vector. The
phases of the vectors can be determined using multiple
techniques. In one method, heavy atoms can be soaked
into a crystal, a method called isomorphous replacement,
and the phases of the vectors can be determined by using
these heavy atoms as reference points in the X-ray
analysis. Otwinowski, 1991, Daresbury, United Kingdom,
80-86. The isomorphous replacement method usually
requires more than one heavy atom derivative. In
another method, the amplitudes and phases of vectors
from a crystalline polypeptide with an already
determined structure can be applied to the amplitudes of
the vectors from a crystalline polypeptide of unknown
structure in order to determine the phases of these
vectors. This second method is known as molecular
replacement, and the protein structure which is used as a
reference must have a closely related structure to the
protein of interest. Naraza, 1994, Proteins 11:281-296.
Thus, the vector information from a PTK of known
structure, such as those reported herein, is useful for
the molecular replacement analysis of another PTK with
unknown structure.

Once the phases of the vectors describing the unit 
cell of a crystal are determined, the vector amplitudes 
and phases, unit cell dimensions, and unit cell symmetry 
can be used as terms in a Fourier transform function. 
The Fourier transform function calculates the electron 
density in the unit cell from these measurements. The 
electron density that describes one of the molecules or 
one of the molecule complexes in the unit cell can be 
referred to as an electron density map. The amino acid 
structures of the sequence or the molecular structures 
of compounds complexed with the crystalline polypeptide 
may then be fit to the electron density using a variety of
computer programs. This step of the process is 
sometimes referred to as model building and can be 
accomplished by using computer programs such as 
TOM/FRODO. Jones, 1985, Methods in Enzymology 115:157- 
171. 
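The role of the amplitudes and phases in the Fourier transform can be illustrated with a deliberately simplified one-dimensional sketch. It assumes a hypothetical "cell" containing two point atoms at known fractional positions, so that the complex vectors (amplitude and phase together) can be computed directly; in a real experiment only the amplitudes are measured and the phases must be obtained as described above. Scale factors involving the cell volume are omitted, and all names are hypothetical.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) for a 1-D cell; atoms are (fractional position, scattering weight) pairs."""
    return {
        h: sum(w * cmath.exp(2j * math.pi * h * x) for x, w in atoms)
        for h in range(-h_max, h_max + 1)
    }

def density(factors, x):
    """Fourier synthesis: rho(x) is proportional to sum_h F(h) * exp(-2*pi*i*h*x)."""
    total = sum(f * cmath.exp(-2j * math.pi * h * x) for h, f in factors.items())
    return total.real  # imaginary parts cancel because F(-h) = conj(F(h))

atoms = [(0.20, 6.0), (0.65, 8.0)]          # hypothetical two-atom "structure"
factors = structure_factors(atoms, h_max=15)

# Sample the density on a grid and locate the strongest peak.
grid = [i / 100 for i in range(100)]
rho = [density(factors, x) for x in grid]
peak = grid[max(range(100), key=lambda i: rho[i])]
print(f"strongest density peak at x = {peak:.2f}")
```

The strongest peak of the synthesized density falls at the position of the heavier of the two atoms, which is the one-dimensional analogue of locating atoms in an electron density map.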

A theoretical electron density map can then be 
calculated from the amino acid structures fit to the 
experimentally determined electron density. The 
theoretical and experimental electron density maps can 
be compared to one another and the agreement between 
these two maps can be described by a parameter called an 
R-factor. A low value for an R-factor describes a high
degree of overlapping electron density between a
theoretical and experimental electron density map. 
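In common crystallographic practice the R-factor is evaluated on the vector amplitudes rather than on the maps themselves, as R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. The sketch below uses this conventional definition, which is an assumption not spelled out in the text above; the amplitude values and the function name are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor on structure-factor amplitudes.

    R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

observed = [10.0, 20.0, 30.0]     # hypothetical measured amplitudes
calculated = [11.0, 18.0, 30.0]   # hypothetical amplitudes computed from a model
print(f"R = {r_factor(observed, calculated):.3f}")
```

A perfect model would give R = 0; larger values indicate poorer agreement between the model and the measurements.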

The R-factor is then minimized by using computer
programs that refine the theoretical electron density
map. A computer program such as X-PLOR can be used for
model refinement by those skilled in the art. Brunger,
1992, Nature 355:472-475. Refinement may be achieved in
an iterative process. A first step can entail altering
the conformation of atoms defined in an electron density
map. The conformations of the atoms can be altered by
simulating a rise in temperature which will increase the
vibrational frequency of the bonds and modify positions
of atoms in the structure. At a particular point in the
atomic perturbation process, a force field, which
typically defines interactions between atoms in terms of
allowed bond angles and bond lengths, van der Waals
interactions, hydrogen bonds, ionic interactions, and
hydrophobic interactions, can be applied to the system
of atoms. Favorable interactions may be described in
terms of free energy and the atoms can be moved over
many iterations until a free energy minimum is achieved.
The refinement process can be iterated until the
R-factor reaches a minimum value.

The three dimensional structure of the molecule or
molecule complex is described by atoms that fit the
theoretical electron density characterized by a minimum
R-value. A file can then be created for the three 
dimensional structure that defines each atom by 
coordinates in three dimensions. Examples of such 
structural coordinate files are defined in Table 1, 
Table 2, Table 3, and Table 4.

V. Structures of FGFR1

The present invention provides high-resolution
three-dimensional structures and atomic structure
coordinates of crystalline FGFR1 and crystalline
FGFR1:AMP-PCP co-complex as determined by X-ray
crystallography. The specific methods used to obtain
the structure coordinates are provided in the examples.
The atomic structure coordinates of crystalline FGFR1,
obtained from the C2-A form of the crystal to 2.0 Å
resolution, are listed in Table 3; the coordinates of
crystalline FGFR1:AMP-PCP co-complex, obtained from the
C2-A form of the crystal to 2.3 Å resolution, are listed
in Table 4.

Those having skill in the art will recognize that
atomic structure coordinates as determined by X-ray
crystallography are not without error. Thus, it is to
be understood that any set of structure coordinates
obtained for crystals of FGFR1, whether native crystals,
derivative crystals or co-crystals, that have a root
mean square deviation ("r.m.s.d.") of less than or equal
to about 1.5 Å when superimposed, using backbone atoms
(N, Cα, C and O), on the structure coordinates listed in
Table 3 or Table 4 are considered to be identical with
the structure coordinates listed in the Tables when at
least about 50% to 100% of the backbone atoms of FGFR1
are included in the superposition.
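The r.m.s.d. criterion can be illustrated as follows. The sketch assumes that the two sets of backbone atom coordinates have already been optimally superimposed and are listed in matching order; the superposition step itself (for example by a least-squares fitting procedure) is not shown. The coordinates and function names are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

def within_threshold(coords_a, coords_b, threshold=1.5):
    """True when the r.m.s.d. is at or below the threshold (1.5 Angstroms in the text)."""
    return rmsd(coords_a, coords_b) <= threshold

reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]   # hypothetical atoms
candidate = [(0.0, 0.3, 0.0), (1.5, 0.0, 0.4), (3.0, 0.0, 0.0)]   # hypothetical atoms
print(f"r.m.s.d. = {rmsd(reference, candidate):.3f} Angstroms")
print(within_threshold(reference, candidate))
```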

Referring now to FIG. 1, the overall structure of
FGFR1 is bi-lobate. The N-terminal lobe of FGFR1 spans
amino acid residues 456-567 (FIG. 3) and comprises a
curled β-sheet of five anti-parallel strands (β1-β5) and
one α-helix (αC). The C-terminal lobe spans amino acid
residues 568-765 (FIG. 3) and comprises two β-strands
(β7, β8) and seven α-helices (αD, αE, αEF, αF-αI). The
secondary structure nomenclature follows that used for
IRK (Hubbard et al., 1994) which in turn is based on the
assignments for cAPK (Knighton et al., 1991). FIG. 2
shows a stereo view of a Cα trace of FGFR1 in the same
orientation as FIG. 1.

A structure-based sequence alignment of the
tyrosine kinase domains of human fibroblast growth
factor receptor 1 (human FGFR1; labelled FGFR1), human
fibroblast growth factor receptors 2, 3 and 4 (labelled
FGFR2, FGFR3 and FGFR4, respectively), a D. melanogaster
homologue (labelled DFDFR1), a C. elegans homologue
(labelled EGL-15) and insulin receptor kinase (labelled
IRK), is shown in FIG. 3. The sequence of FGFR1, which
is not shown in FIG. 3, is identical to the sequence of
FGFR1 except that FGFR1 has the following amino acid
substitutions and additions: Cys-488 → Ala, Cys-584 →
Ser, Leu-457 → Val and an additional five N-terminal
amino acids (Ser-Ala-Ala-Gly-Thr). The secondary
structure assignments for FGFR1 and IRK were obtained
using the Kabsch and Sander algorithm (Kabsch and
Sander, 1983) as implemented in PROCHECK (Laskowski et
al., 1993). In the FGF receptor sequences, a period
represents sequence identity to FGFR1. In the IRK
sequence, residues that are identical to FGFR1 are
highlighted. A hyphen denotes an insertion.

The numbers under the EGL-15 sequence represent the
fractional solvent accessibility (FSA) of the residue
in the FGFR1 structure. The FSA is the ratio of
the solvent-accessible surface area of a residue in the
FGFR1 structure to that of the same residue in a
Gly-X-Gly tripeptide. A value of 0 represents an FSA
between 0.00 and 0.09; 1 represents an FSA between 0.10
and 0.19, etc. The higher the value, the more
solvent-exposed the residue. An asterisk or pound sign
in the FSA line indicates that the residue (asterisk) or
side chain (pound sign) is not included in the atomic
model due to disorder. The numbers below the FSA line
are the FSAs
for those residues that form part of a dimer interface.
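The single-digit encoding of the FSA described above amounts to binning the ratio in steps of 0.10. A minimal sketch follows; the function name is hypothetical, and the handling of ratios of 1.0 or more (reported here as 9) is an assumption, since the text does not say how such values are shown.

```python
def fsa_digit(fsa):
    """Map a fractional solvent accessibility to the single digit used in the FSA line."""
    if fsa < 0:
        raise ValueError("FSA cannot be negative")
    hundredths = round(fsa * 100)       # work in integers to avoid float edge cases
    return min(hundredths // 10, 9)     # assumption: values of 1.0 or more are shown as 9

for value in (0.05, 0.10, 0.19, 0.47, 0.95):
    print(value, "->", fsa_digit(value))
```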

The amino acid residue numbers for FGFR1, and hence
FGFR1, and IRK provided in FIG. 3 are used in the
discussion that follows. Significant differences in the
N-terminal lobe of FGFR1 as compared to IRK are found in
the loops between β strands and in αC. Residues from the
end of β1 through the beginning of β2 (amino acid
residues 485-490) form the nucleotide-binding loop,
named because of its role in ATP coordination. This
residue stretch contains the protein kinase-conserved
GXGXXG sequence motif, where X is any amino acid. This
loop is poorly ordered in one FGFR1 molecule in the
asymmetric unit and disordered (i.e., not included in
the atomic model) in the other FGFR1 molecule in the
asymmetric unit. The loop between β1 and β3 is
disordered in both FGFR1 molecules comprising the
asymmetric unit.
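A sequence can be scanned for the GXGXXG motif with a simple pattern search. The sketch below uses a regular expression in which "." stands for any residue; the example sequence is hypothetical and is not taken from FIG. 3, and the function name is likewise an assumption.

```python
import re

# G-X-G-X-X-G, where X is any residue; the lookahead also reports overlapping hits.
GLY_RICH_MOTIF = re.compile(r"(?=(G.G..G))")

def find_gly_rich_motifs(sequence):
    """Return (1-based start position, matched segment) for each GXGXXG occurrence."""
    return [(m.start() + 1, m.group(1))
            for m in GLY_RICH_MOTIF.finditer(sequence.upper())]

example = "MKLGAGAFGEVRK"   # hypothetical sequence, not taken from FIG. 3
print(find_gly_rich_motifs(example))
```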

Referring now to FIG. 4A, which provides a ribbon
diagram of the N-terminal lobes of FGFR1 and IRK in
which the Cα atoms of the β-sheets have been
superimposed, it can be seen that in FGFR1 αC is longer
by one helical turn than in IRK and is oriented such
that residues Lys-514 and Glu-531, which are conserved
in protein kinases, form a salt bridge (represented by a
black line). While not intending to be bound by theory,
this salt bridge is believed to be important for proper
positioning of the conserved lysine side chain, which
coordinates two phosphate oxygens of ATP. The salt
bridge is observed in the structures of cAPK (Knighton
et al., 1991) and mitogen-activated protein kinase
(MAPK) (Zhang et al., 1994).

Referring now to FIG. 4B, which provides a ribbon
diagram of the C-terminal lobes of FGFR1 and IRK in
which the Cα atoms of the α-helices have been
superimposed, a significant difference is found in the
C-terminal helix of FGFR1 when compared to IRK; helix αI
of FGFR1 is longer by seven residues (two helical turns)
than its counterpart in IRK. The extended length of αI
is presumably important in the biological functioning of
FGF receptors, since the tyrosine autophosphorylation
site to which an SH2 domain of PLCγ binds is six
residues C-terminal to this helix.

The structure of FGFR1 displays an open disposition
of the N- and C-terminal lobes. Despite having
different sets of lattice contacts, the two FGFR1
molecules in the asymmetric unit have only a 2°
difference in relative lobe orientation. It appears as
though the steric interaction of residues in αC
(Glu-531 and Met-534) with Phe-642 and Gly-643 of the
protein kinase-conserved DFG sequence at the beginning
of the activation loop accounts for the open
conformation of FGFR1.

The active site of FGFR1 is characterized by at
activation loop and nucleotide binding loop. Unlike the
structure of IRK, in which Tyr-1162 occupies the active
site of the molecule, the active sites of both FGFR1
molecules in the asymmetric unit are unoccupied.

The activation loop, which regulates
phosphorylation, is characterized by at least residues
640 to 663. Quite surprisingly, while the activation
loops of FGFR1 and IRK contain the same number of amino
acid residues and share greater than 50% sequence
homology, the paths of the polypeptide chains are
strikingly dissimilar, diverging at Ala-640 (Gly-1149 in
IRK) and reconverging at Val-664 (Val-1173 in IRK).
Tyr-653 and Tyr-654 are not bound in the active site.
Instead, these residues point away from it. Tyr-653 is
in van der Waals contact with several hydrophobic
residues (Val-664, Leu-672 and Phe-710) and is hydrogen-
bonded via its hydroxyl group to a backbone carbonyl
oxygen (Leu-672). Tyr-654 is more solvent exposed than
Tyr-653, and its only van der Waals contact is with
Val-706. Temperature factor data suggest that the
activation loop is relatively mobile and adopts multiple
conformations.

The catalytic loop of protein kinases lies between
secondary structure elements αE and β7 and contains an
invariant aspartic acid residue (Asp-623 in FGFR1) which
serves as the catalytic base in the phosphotransfer
reaction, abstracting the proton from the hydroxyl group
of the substrate tyrosine, serine or threonine. The
catalytic loop sequence of FGFR1 comprises at least
residues His-621 to Asn-628 (amino acid sequence
HRDLAARN), and is identical to that for IRK and most
receptor and non-receptor PTKs.

In addition to the two tyrosine autophosphorylation
sites in the activation loop (Tyr-653 and Tyr-654),
there are four other autophosphorylation sites present
in the FGFR1 crystals of the invention: one in the
juxtamembrane region (Tyr-463), two in the kinase insert
(Tyr-583 and Tyr-585) and one in the C-terminal lobe
(Tyr-730) (Mohammadi et al., 1996). They exhibit
varying degrees of conservation in mammalian FGF
receptors: Tyr-463 and Tyr-585 in FGFR1 and 2; Tyr-583
in FGFR1, 2 and 3; and Tyr-730 in FGFR1, 2, 3 and 4
(FIG. 3).

Referring now to FIG. 5, the positions of the
autophosphorylation sites are mapped onto the FGFR1
structure. The juxtamembrane site (Tyr-463) and the
residues N-terminal to it are disordered in one of the
FGFR1 molecules in the asymmetric unit. In the other
molecule in the asymmetric unit, Tyr-463 is involved in a
lattice contact.

The kinase insert region (the region between
helices αD and αE) contains autophosphorylation sites
Tyr-583 and Tyr-585 and is disordered in both FGFR1
molecules in the asymmetric unit of the C2-A form of the
crystal. In the C2-B form, several lattice contacts
partially pin down this region in one of the two FGFR1
molecules in the asymmetric unit, allowing a trace of
the polypeptide chain to be made. There is no well-
defined secondary structure for these residues.
Tyr-730, situated in αH in the C-terminal lobe, is nearly
buried and the side-chain hydroxyl group makes two
hydrogen-bonds. The side chains of neighboring Met-732
and Met-733 are both buried. Therefore, phosphorylation
of Tyr-730 would presumably require prior unfolding of
αH.

Aside from Tyr-730, the five other
autophosphorylation sites (including Tyr-653 and
Tyr-654) are found in relatively mobile segments of the
FGFR1 molecule. While not intending to be bound by
theory, the spatial positions of the autophosphorylation
sites relative to the active site suggest that
autophosphorylation occurs by a trans mechanism between
two kinase domains, supporting the hypothesis that
ligand-induced receptor dimerization is critical for the
initiation of autophosphorylation events.

The structure of crystalline FGFR1:AMP-PCP
co-complex is essentially similar to that observed for
crystalline FGFR1. There are no significant changes in
the structure of FGFR1 induced by AMP-PCP binding. In 
particular, binding of AMP-PCP, and by extension ATP, 
does not by itself promote lobe closure under the 
crystallization conditions used. Furthermore, 
complexation did not result in any noticeable changes in 
the conformations of the activation and nucleotide- 
binding loops. 

The crystalline FGFR1:AMP-PCP co-complex contains
hydrogen bonds that are present between N1 of adenine
and the amide nitrogen of Ala-564 and between N6 of
adenine and the carbonyl oxygen of Glu-562. The adenine
ring is flanked on one side by Leu-484 and Val-492
(N-terminal lobe) and on the other side by Leu-630
(C-terminal lobe). The ribose hydroxyl groups make no
direct hydrogen bonds with protein atoms. Lys-514 is
hydrogen-bonded to oxygens of the β- and γ-phosphates.
There is no unambiguous electron density that would
indicate the positions of Mg²⁺ ions. Generally, AMP-PCP
appears to be coordinated rather loosely to
unphosphorylated FGFR1, being bound to the "roof" of the
cleft rather than being tightly sandwiched between the
two kinase lobes.



Structural Differences Between FGF-R and IRK

Several features distinguish the FGF-receptor
structure from that of the insulin-receptor tyrosine
kinase. These distinctions are likely to be important
in signaling by FGF-receptors, and other monomeric
receptors that are believed to undergo ligand-induced
dimerization.

The most significant difference between the
structures of FGFR1 and IRK is the conformation of the
activation loop. In FGFR1, the activation loop is
disposed such that the binding site for substrate
peptides is blocked not by an activation loop tyrosine,
as in IRK, but by Arg-661 and PTK-invariant Pro-663,
while the ATP binding site is accessible. This
represents another molecular mechanism by which a
receptor PTK may be autoinhibited. The observed
autoinhibition in FGFR1 would appear to be weaker than
that in IRK because of fewer specific interactions made
by residues in the FGFR1 activation loop (manifested in
the relatively higher B-values) and the accessibility of
the ATP site. One obvious distinction between the
insulin and FGF receptor families is that in the former
receptors are covalently linked heterotetramers (α₂β₂),
whereas in the latter, receptor dimerization is ligand
dependent. Receptors whose kinase domains are always in
close proximity may require a stronger autoinhibition
mechanism than those receptors that associate only upon
ligand binding (Taylor et al., 1995). Since most growth
factor receptors undergo ligand-dependent dimerization
and activation, the FGF receptor autoinhibition
mechanism appears to be a more general one.

VI. Uses of the Crystals and Atomic Structure
Coordinates

The crystals of the invention, and particularly the
atomic structure coordinates obtained therefrom, have a
wide variety of uses. For example, the crystals
described herein can be used as a starting material in
any of the art-known methods of use for receptor and
non-receptor tyrosine kinases. Such methods of use
include, for example, identifying molecules that bind to
the native or mutated catalytic domain of tyrosine
kinases. The crystals and structure coordinates are
particularly useful for identifying compounds that
inhibit receptor and non-receptor tyrosine kinases as an
approach towards developing new therapeutic agents (see,
e.g., Levitzki and Gazit, 1995).

The structure coordinates described herein can be
used as phasing models for determining the crystal
structures of additional native or mutated tyrosine
kinase domains, as well as the structures of co-crystals
of such domains with ligands such as inhibitors,
agonists, antagonists, and other molecules. The
structure coordinates, as well as models of the three-
dimensional structures obtained therefrom, can also be
used to aid the elucidation of solution-based structures
of native or mutated tyrosine kinase domains, such as
those obtained via NMR. Thus, the crystals and atomic
structure coordinates of the invention provide a
convenient means for elucidating the structures and
functions of receptor and non-receptor tyrosine kinases.

For purposes of clarity and discussion, the
crystals of the invention will be described by reference
to specific FGFR1 exemplary crystals. Those skilled in
the art will appreciate that the principles described
herein are generally applicable to crystals of the
tyrosine kinase domain of any cytoplasmic tyrosine
kinase that undergoes ligand-induced dimerization or
receptor tyrosine kinase, including but not limited to
the tyrosine kinases of FIG. 6.

VII. Structure Determination for PTKs with Unknown
Structure Using Structural Coordinates

Structural coordinates, such as those set forth in
Table 1, Table 2, Table 3, and Table 4, can be used to
determine the three dimensional structures of PTKs with
unknown structure. The methods described below can
apply structural coordinates of a polypeptide with known
structure to another data set, such as an amino acid
sequence, X-ray crystallographic diffraction data, or
nuclear magnetic resonance (NMR) data. Preferred
embodiments of the invention relate to determining the
three dimensional structures of PTKs and related
polypeptides. These include receptor PTKs such as
FGF-R, PDGF-R, KDR, CCK4, MET, TRKA, AXL, TIE, EPH, RYK,
DDR, ROS, RET, LTK, ROR1, and MUSK. Non-receptor PTKs
such as SRC, BRK, BTK, CSK, ABL, ZAP70, FES, FAK, JAK,
and ACK can also be used in the methods described
herein.

Structures Using Amino Acid Homology

Homology modeling is a method of applying 
structural coordinates of a polypeptide of known 
structure to the amino acid sequence of a polypeptide of 
unknown structure. This method is accomplished using a 
computer representation of the three dimensional 
structure of a polypeptide or polypeptide complex, the 
computer representation of amino acid sequences of the 
polypeptides with known and unknown structures, and 
standard computer representations of the structures of 
amino acids. Homology modeling comprises the steps of 
(a) aligning the amino acid sequences of the 
polypeptides with and without known structure; (b) 
transferring the coordinates of the conserved amino 
acids in the known structure to the corresponding amino 
acids of the polypeptide of unknown structure; (c) refining
the subsequent three dimensional structure; and (d)
constructing structures of the rest of the polypeptide. 
One skilled in the art recognizes that conserved amino 
acids between two proteins can be determined from the 
sequence alignment step in step (a) . 

The above method is well known to those skilled in
the art. Greer, 1985, Science 228, 1055. Blundell et
al., 1988, Eur. J. Biochem. 172, 513. A computer
program currently utilized for homology modeling by
those skilled in the art is the Homology module in the
Insight II modeling package distributed by Molecular
Simulations Inc.

Alignment of the amino acid sequence is
accomplished by first placing the computer
representation of the amino acid sequence of a
polypeptide with known structure above the amino acid
sequence of the polypeptide of unknown structure. Amino
acids in the sequences are then compared and groups of
amino acids that are homologous (e.g., amino acid side
chains that are similar in chemical nature: aliphatic,
aromatic, polar, or charged) are grouped together. This
method will detect conserved regions of the polypeptides
and account for amino acid insertions or deletions.
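
The alignment step can be illustrated with a short sketch. The code below is not the Homology module of Insight II; it is a generic Needleman-Wunsch global alignment, assumed here purely for illustration, in which residues of the same chemical class score as similar. The class memberships, the scoring values, and the function names (similarity, align, conserved_positions) are all hypothetical choices made for this example.

```python
# Chemical-similarity classes named in the text. The exact membership of each
# class is an assumption made for this sketch.
GROUPS = {
    "aliphatic": set("AVLIMGP"),
    "aromatic": set("FWY"),
    "polar": set("STNQC"),
    "charged": set("DEKRH"),
}


def similarity(a, b, match=2, similar=1, mismatch=-1):
    """Score a residue pair: identical > same chemical class > unrelated."""
    if a == b:
        return match
    for members in GROUPS.values():
        if a in members and b in members:
            return similar
    return mismatch


def align(known, unknown, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped sequences."""
    n, m = len(known), len(unknown)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + similarity(known[i - 1], unknown[j - 1]),
                score[i - 1][j] + gap,  # gap in the unknown sequence
                score[i][j - 1] + gap,  # gap in the known sequence
            )
    # Trace back from the bottom-right corner to recover the alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + similarity(known[i - 1], unknown[j - 1])):
            top.append(known[i - 1])
            bottom.append(unknown[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(known[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(unknown[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def conserved_positions(top, bottom):
    """Alignment columns whose residues are identical or chemically similar."""
    return [k for k, (a, b) in enumerate(zip(top, bottom))
            if a != "-" and b != "-" and similarity(a, b) > 0]
```

Under these assumed scores, a tyrosine aligned with a phenylalanine counts as conserved (both aromatic), and a residue missing from one sequence appears as a gap character in that sequence.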

Once the amino acid sequences of the polypeptides
with known and unknown structures are aligned, the
structures of the conserved amino acids in the computer
representation of the polypeptide with known structure
are transferred to the corresponding amino acids of the
polypeptide whose structure is unknown. For example, a
tyrosine in the amino acid sequence of known structure
may be replaced by a phenylalanine, the corresponding
homologous amino acid in the amino acid sequence of
unknown structure.

The structures of amino acids located in
non-conserved regions are to be assigned manually by either
using standard peptide geometries or molecular
simulation techniques, such as molecular dynamics. The
final step in the process is accomplished by refining
the entire structure using molecular dynamics and/or
energy minimization.

The homology modeling method is well known to those
skilled in the art and has been practiced using
different protein molecules. The three dimensional
structure of the polypeptide corresponding to the
catalytic domain of a serine/threonine protein kinase,
myosin light chain protein kinase, was homology modeled
from the cAMP-dependent protein kinase catalytic
subunit. Knighton et al., 1992, Science 258:130-135.

Structures Using Molecular Replacement 

Molecular replacement is a method of applying the
X-ray diffraction data of a polypeptide of known
structure to the X-ray diffraction data of a polypeptide
of unknown structure. This method can be utilized to
define the phases describing the X-ray diffraction data
of a polypeptide of unknown structure when only the
amplitudes are known. X-PLOR is a commonly utilized
computer software package used for molecular
replacement. Brünger, 1992, Nature 355:472-475. AMORE
is another program used for molecular replacement.
Navaza, 1994, Acta Crystallogr. A50: 157-163.
Preferably, the resulting structure does not exhibit a
root-mean-square deviation of more than 3 Å.
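
The root-mean-square deviation criterion can be made concrete with a minimal sketch. The text does not state which atoms are compared or how the structures are superimposed, so the code below assumes two equally long lists of matched atom coordinates (in Å) that have already been superimposed. The function names rmsd and within_tolerance are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between matched (x, y, z) coordinates.

    Assumes the two structures are already superimposed and that the
    i-th atom of one list corresponds to the i-th atom of the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def within_tolerance(coords_a, coords_b, limit=3.0):
    """True when the deviation does not exceed the preferred 3 Å limit."""
    return rmsd(coords_a, coords_b) <= limit
```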

A goal of molecular replacement is to align the
positions of atoms in the unit cell by matching electron
diffraction data from two crystals. A program such as
X-PLOR can involve four steps. A first step can be to
determine the number of molecules in the unit cell and
define the angles between them. A second step can
involve rotating the diffraction data to define the
orientation of the molecules in the unit cell. A third
step can be to translate the electron density in three
dimensions to correctly position the molecules in the
unit cell. Once the amplitudes and phases of the X-ray
diffraction data are determined, an R-factor can be
calculated by comparing electron diffraction maps
calculated experimentally from the reference data set
and calculated from the new data set. An R-factor
between 30% and 50% indicates that the orientations of the
atoms in the unit cell are reasonably determined by this
method. A fourth step in the process can be to decrease
the R-factor to roughly 20% by refining the new electron
density map using iterative refinement techniques
described herein and known to those of ordinary skill in
the art.
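
The text does not give a formula for the R-factor. The sketch below assumes the conventional crystallographic definition, R = Σ| |F_obs| - k|F_calc| | / Σ|F_obs|, summed over reflections, with a single scale factor k. The function name r_factor and the scale parameter are hypothetical, and the thresholds in the comments simply restate the ranges quoted above.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R-factor between observed and calculated amplitudes.

    f_obs and f_calc are matched lists of structure factor amplitudes,
    one entry per reflection. The result is a fraction; multiply by 100
    for a percentage.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and equal in length")
    numerator = sum(abs(abs(o) - scale * abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator


# Per the text: 0.30-0.50 suggests a reasonable molecular replacement
# solution; refinement then aims for roughly 0.20.
```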

Structures Using NMR Data

Structural coordinates of a polypeptide or
polypeptide complex derived from X-ray crystallographic
techniques can be applied towards the elucidation of
three dimensional structures of polypeptides from
nuclear magnetic resonance (NMR) data. This method is
used by those skilled in the art. Wuthrich, 1986, John
Wiley and Sons, New York: 176-199; Pflugrath et al.,
1986, J. Molecular Biology 189:383-386; Kline et al.,
1986, J. Molecular Biology 189:377-382. While the
secondary structure of a polypeptide is often readily
determined by utilizing two-dimensional NMR data, the
spatial connections between individual pieces of
secondary structure are not as readily determinable.
The coordinates defining a three-dimensional structure
of a polypeptide derived from X-ray crystallographic
techniques can guide the NMR spectroscopist to an
understanding of these spatial interactions between
secondary structural elements in a polypeptide of
related structure.

The knowledge of spatial interactions between 
secondary structural elements can greatly simplify 
Nuclear Overhauser Effect (NOE) data from two- 
dimensional NMR experiments. Additionally, applying the 
crystallographic coordinates after the determination of 
secondary structure by NMR techniques only simplifies 
the assignment of NOEs relating to particular amino
acids in the polypeptide sequence and does not greatly 
bias the NMR analysis of polypeptide structure. 
Conversely, using the crystallographic coordinates to 
simplify NOE data while determining secondary structure 
of the polypeptide would bias the NMR analysis of 
protein structure. 

As the analysis of polypeptide structure by NMR 
methods is a relatively new technique, the use of 
structural coordinates defining a PTK structure will 
most likely be utilized more frequently in the near 
future. As the method progresses, the three dimensional 
structure analysis of polypeptides of the same size as a 
PTK catalytic domain will become more frequent. 

VIII. Structure-Based Design of Modulators of PTK
Function Utilizing Structural Coordinates

Structure-based modulator design and identification
methods are powerful techniques that can involve
searches of computer data bases containing a wide
variety of potential modulators and chemical functional
groups. The computerized design and identification of
modulators is useful as the computer data bases contain
more compounds than the chemical libraries, often by an
order of magnitude. For reviews of structure-based drug
design and identification see Kuntz et al., 1994, Acc.
Chem. Res. 27:117; Guida, 1994, Current Opinion in
Struc. Biol. 4: 777; Colman, 1994, Current Opinion in
Struc. Biol. 4: 868.

The three dimensional structure of a polypeptide
defined by structural coordinates can be utilized by
these design methods. The structural coordinates of
Table 1, Table 2, Table 3, and Table 4 can be utilized
by this method. In addition, the three dimensional
structures of receptor and non-receptor PTKs determined
by the homology, molecular replacement, and NMR
techniques described herein can also be applied to
modulator design and identification methods. Thus, the
structures of receptor PTKs, FGF-R, PDGF-R, FLK, CCK4,
MET, TRKA, AXL, TIE, EPH, RYK, DDR, ROS, RET, LTK, ROR1,
and MUSK, can be utilized by the methods described
herein. The structures of non-receptor PTKs, SRC, BRK,
BTK, CSK, ABL, ZAP70, FES, FAK, JAK, and ACK, can also
be utilized by the rational modulator design method.

Design by Searching Molecular Data Bases

One method of rational modulator design searches
for modulators by docking the computer representation of
compounds from a data base of molecules. Publicly
available data bases include:

a) ACD from Molecular Designs Limited

b) NCI from National Cancer Institute

c) CCDC from Cambridge Crystallographic Data Center

d) CAST from Chemical Abstract Service

e) Derwent from Derwent Information Limited

f) Maybridge from Maybridge Chemical Company LTD

g) Aldrich from Aldrich Chemical Company

h) Directory of Natural Products from Chapman & Hall



One such data base (ACD distributed by Molecular Designs 
Limited Information Systems) contains, for example, 
200,000 compounds that are synthetically derived or are 
natural products. Methods available to those skilled in 
the art can convert a data set represented in two 
dimensions to one represented in three dimensions. 
These methods are enabled by such computer programs as 
CONCORD from Tripos Associates or DB-Converter from
Molecular Simulations Limited. 

Multiple methods of structure-based modulator
design are known to those in the art. Kuntz et al.,
1982, J. Mol. Biol. 162: 269; Kuntz et al., 1994,
Acc. Chem. Res. 27: 117; Meng et al., 1992, J. Comp.
Chem. 13: 505; Bohm, 1994, J. Comp. Aided Molec. Design
8: 623.

A computer program widely utilized by those skilled 
in the art of rational modulator design is DOCK from the 
University of California in San Francisco. The general 
methods utilized by this computer program and programs 
like it are described in three applications below. More 
detailed information regarding some of these techniques 
can be found in the Molecular Simulations User Guide, 
1995 . 
A typical computer program used for this purpose
can comprise the following steps:

(a) remove the existing compound from the protein;

(b) dock the structure of another compound into
the active-site using the computer program (such as
DOCK) or by interactively moving the compound into the
active-site;

(c) characterize the space between the compound
and the active-site atoms;

(d) search libraries for molecular fragments which
(i) can fit into the empty space between the compound and
the active-site, and (ii) can be linked to the compound;
and

(e) link the fragments found above to the compound
and evaluate the new modified compound.

Part (c) refers to characterizing the geometry and
the complementary interactions formed between the atoms
of the active-site and the compounds. A favorable
geometric fit is attained when a significant surface
area is shared between the compound and active-site
atoms without forming unfavorable steric interactions.
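
The idea of a favorable geometric fit can be sketched numerically. The code below is not the scoring function of DOCK or of any named program; it is a deliberately simple stand-in, assumed for illustration, that counts close but non-overlapping atom pairs as shared surface and penalizes pairs that are too close as steric clashes. The distance cutoffs, the penalty weight, and the name fit_score are hypothetical.

```python
import math


def fit_score(compound_atoms, site_atoms, clash=3.0, contact=4.5,
              clash_penalty=10.0):
    """Crude geometric fit between a docked compound and active-site atoms.

    Each argument is a list of (x, y, z) coordinates in Å. A pair closer
    than `clash` counts as an unfavorable steric interaction; a pair between
    `clash` and `contact` counts as a favorable contact. Returns a tuple
    (score, contacts, clashes), where a higher score means a better fit.
    """
    contacts = 0
    clashes = 0
    for c_atom in compound_atoms:
        for s_atom in site_atoms:
            distance = math.dist(c_atom, s_atom)
            if distance < clash:
                clashes += 1
            elif distance <= contact:
                contacts += 1
    return contacts - clash_penalty * clashes, contacts, clashes
```

Ranking candidate poses or compounds by such a score is one way a program could decide which docked structures merit closer evaluation; real programs also score chemical complementarity, which this sketch omits.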

One skilled in the art would note that the method 
can be performed by skipping parts (d) and (e) and 
screening a data base of many compounds. 

Structure-based design and identification of
modulators of PTK function can be used in conjunction
with assay screening. As a large computer data base of
compounds (around 10,000 compounds) can be searched in a
matter of hours, the computer-based method can narrow
the compounds tested as potential modulators of PTK
function in cellular assays.
The above descriptions of structure-based modulator
design are not all encompassing and other methods are
reported in the literature:

(1) CAVEAT: Bartlett et al., 1989, in "Chemical and
Biological Problems in Molecular Recognition", Roberts,
S.M.; Ley, S.V.; Campbell, M.M. eds.; Royal Society of
Chemistry: Cambridge, pp. 182-196.

(2) FLOG: Miller et al., 1994, J. Comp. Aided
Molec. Design 8:153.

(3) PRO Modulator: Clark et al., 1995, J. Comp.
Aided Molec. Design 9:13.

(4) MCSS: Miranker and Karplus, 1991, Proteins:
Structure, Function, and Genetics 11:29.

(5) AUTODOCK: Goodsell and Olson, 1990, Proteins:
Structure, Function, and Genetics 8:195.

(6) GRID: Goodford, 1985, J. Med. Chem. 28:849.

Design by Modifying Compounds in Complex with PTKs

Another way of identifying compounds as potential
modulators is to modify an existing modulator in the
polypeptide active-site. For example, the computer
representation of modulators can be modified within the
computer representation of a PTK active-site. Detailed
instructions for this technique can be found in the
Molecular Simulations User Manual, 1995 in LUDI. The
computer representation of the modulator is modified by
the deletion of a chemical group or groups or by the
addition of a chemical group or groups.

Upon each modification to the compound, the atoms
of the modified compound and active-site can be shifted
in conformation and the distance between the modulator
and the active-site atoms may be scored along with any
complementary interactions formed between the two
molecules. Scoring can be complete when a favorable
geometric fit and favorable complementary interactions
are attained. Compounds that have favorable scores are
potential modulators of PTK function.

Design by Modifying the Structure of Compounds That Bind
PTKs

A third method of structure-based modulator design
is to screen compounds designed by a modulator building
or modulator searching computer program. Examples of
these types of programs can be found in the Molecular
Simulations Package, Catalyst. Descriptions for using
this program are documented in the Molecular Simulations
User Guide (1995). Other computer programs used in this
application are (ISIS/HOST, ISIS/BASE, ISIS/DRAW) from
Molecular Designs Limited and UNITY from Tripos
Associates.

These programs can be operated on the structure of
a compound that has been removed from the active-site of
the three dimensional structure of a compound-PTK
complex. Operating the program on such a compound is
preferable since it is in a biologically active
conformation.

A modulator construction computer program is a
computer program that may be used to replace computer
representations of chemical groups in a compound
complexed with a PTK with groups from a computer data
base. A modulator searching computer program is a
computer program that may be used to search computer
representations of compounds from a computer data base
that have similar three dimensional structures and
similar chemical groups as a compound bound to a PTK.

A typical program can operate by using the
following general steps:

(a) map the compounds by chemical features such as
by hydrogen bond donors or acceptors,
hydrophobic/lipophilic sites, positively ionizable
sites, or negatively ionizable sites;

(b) add geometric constraints to the mapped
features; and

(c) search data bases with the model generated in
(b).

Those skilled in the art recognize that for
indolinones, the important chemical features include,
but are not limited to, a hydrogen bond donor, a
hydrogen bond acceptor, and two hydrophobic points of
contact. Those skilled in the art also recognize that
not all of the possible chemical features of the
compound need be present in the model of (b). One can
use any subset of the model to generate different models
for data base searches.
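
The three general steps (map features, add geometric constraints, search) can be sketched as follows. This is not how Catalyst, ISIS, or UNITY work internally; it is a minimal illustration that assumes each compound has already been reduced to a list of labelled feature points with three dimensional coordinates, and that the geometric constraints are the pairwise distances between the model's features within a tolerance. The names matches and search, the feature labels, and the tolerance are hypothetical.

```python
import math
from itertools import permutations


def matches(model, compound, tol=1.0):
    """Test whether a compound contains the model's features in a similar geometry.

    model and compound are lists of (kind, (x, y, z)) feature points, for
    example ("donor", (0.0, 0.0, 0.0)). The compound matches when some
    assignment of its features to the model's features has identical kinds
    and reproduces every pairwise model distance to within `tol` Å.
    """
    size = len(model)
    for combo in permutations(range(len(compound)), size):
        if any(compound[c][0] != model[m][0] for m, c in enumerate(combo)):
            continue
        consistent = True
        for a in range(size):
            for b in range(a + 1, size):
                d_model = math.dist(model[a][1], model[b][1])
                d_found = math.dist(compound[combo[a]][1], compound[combo[b]][1])
                if abs(d_model - d_found) > tol:
                    consistent = False
        if consistent:
            return True
    return False


def search(model, database, tol=1.0):
    """Return the names of data base entries that match the model.

    database maps a compound name to its list of feature points.
    """
    return [name for name, features in database.items()
            if matches(model, features, tol)]
```

Passing a slice of the model (for example, only the donor, the acceptor, and one hydrophobic point) corresponds to searching with a subset of the features, which generally returns more candidate compounds than the full model.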

IX. Organic Synthetic Techniques 

The versatility of computer-based modulator design
and identification lies in the diversity of structures
screened by the computer programs. The computer
programs can search data bases that contain 200,000
molecules and can modify modulators already complexed
with the enzyme with a wide variety of chemical
functional groups. A consequence of this chemical
diversity is that a potential modulator of PTK function
may take a chemical form that is not predictable. A
wide array of organic synthetic techniques exist in the
art to meet the challenge of constructing these
potential modulators of PTK function. Many of these
organic synthetic methods are described in detail in
standard reference sources utilized by those skilled in
the art. One example of such a reference is March,
1994, Advanced Organic Chemistry: Reactions, Mechanisms,
and Structure, New York, McGraw Hill. Thus, the
techniques required to synthesize a potential modulator
of PTK function identified by computer-based methods are
readily available to those skilled in the art of organic
chemical synthesis.

X. Cellular Assays Measuring the Effect of a PTK
Modulator in Signal Transduction Pathways

Cellular assays can be used to test the activity of
a potential modulator of PTK function as well as
diagnose a disease associated with inappropriate PTK
activity. A potential modulator of PTK function can be
tested for activity in vitro by assays that measure the
effect of a potential modulator on the
autophosphorylation of a particular PTK over-expressed
in a cell line. Thus, a modulator that acts as a potent
inhibitor of the catalytic domain corresponding to a PTK
would decrease the amount of autophosphorylation
catalyzed by that PTK. Potential modulators could also
be tested for activity in cell growth assays in vitro as
well as in animal model assays in vivo.

In vivo assays are also useful for testing the
bioactivity of a potential modulator designed by the
methods of the invention.

Materials, methods, and experimental data for these 
assays are fully described in WO 96/40116 published on
December 19, 1996, entitled "Indolinone Compounds for 
the Treatment of Disease" . This application is 
incorporated herein by reference in its entirety, 
including all drawings, figures, and tables. 

XI. Administration of Modulators of PTK Function as
Therapeutics for Disease

Methods of administering compounds to organisms as 
therapeutics for disease are fully described in WO 
96/40116 published on December 19, 1996, entitled 
"Indolinone Compounds for the Treatment of Disease". 
This application is incorporated herein by reference in 
its entirety, including all drawings, figures, and 
tables . 

EXAMPLES 

The examples below are non- limiting and are merely 
representative of various aspects and features of the 
present invention. The examples provide illustrative 
methods for obtaining crystalline forms of protein 
kinase polypeptides, methods for determining three 
dimensional structures of these protein kinase 
polypeptides, and methods for identifying modulators of 
protein kinases using the three dimensional structures 
of the protein kinases. 



WO 98/07835    PCT/US97/14885

EXAMPLE 1: X-ray Crystallographic Structure
Determination of FGFR1

Polypeptide Synthesis and Isolation
A recombinant baculovirus was engineered to encode
residues 456-765 of human FGFR1. A cleavable N-terminal
histidine tag was incorporated to aid in protein
purification. Three amino acid substitutions were
introduced: Cys-488 to Ala, Cys-584 to Ser and Leu-457
to Val. The two cysteine substitutions were made to
prevent the formation of disulfide-linked oligomers,
which occurs for the native protein. The substitution
Leu-457 to Val introduced an NcoI cloning site near
Met-456. The codon for Tyr-766 (TAC) was changed to a stop
codon (TAG) and a HindIII cloning site was generated
following this stop codon. These substitutions were
introduced into the full length, human cDNA of FGFR1 in
M13mp19 by site-directed mutagenesis according to the
manufacturer's protocol (Amersham).

The resulting construct was digested with NcoI and
HindIII and was ligated into appropriately digested
pBlueBac HistagB (Invitrogen). Transfection of insect
cells (Sf9) was performed with the BaculoGold
transfection system according to the manufacturer's
protocol (Pharmingen). Following identification of
positive plaques, the recombinant baculovirus was
amplified to high titer (5 x 10^7 virus particles/ml). Sf9
cells were grown in 175-cm^2 flasks to a density of
2-3 x 10^7 per flask and infected with recombinant baculovirus

with a multiplicity of infection (MOI) of 10.

After 48 hr, cells were harvested by centrifugation
at 3,000g for 35 min at 4°C and then lysed in 25 mM
HEPES (pH 7.5), 150 mM NaCl, 10% glycerol, 1.5 mM MgCl2,
1% Triton X-100, 10 µg/ml aprotinin, 10 µg/ml
leupeptin, and 1 mM phenylmethylsulfonyl fluoride
(PMSF). Lysates were centrifuged in a Sorvall RC 5C
(Dupont) for 1 hr at 4°C at 40,000g followed by
ultracentrifugation in an XL-80 (Beckman) at 100,000g
for 1 hr. After centrifugation, the clarified lysate
was passed over a Ni2+-chelating column (Pharmacia), and
the bound histidine-tagged fusion protein was eluted
with 100 mM imidazole (pH 7.5). Pooled fractions were
loaded onto a Mono Q anion exchange column (Pharmacia)
and eluted with a NaCl gradient from 0 to 500 mM.

The fractions containing the fusion protein were
concentrated in a Centricon-30 (Amicon), and the
histidine tag was removed by overnight digestion with
enterokinase (Biozyme) at 20°C. The digestion was
terminated by the addition of aprotinin, leupeptin,
PMSF, TPCK, and bovine pancreatic trypsin inhibitor
(BPTI). The cleaved kinase domain was then separated
from the histidine tag on a Superose 12 size-exclusion
column (Pharmacia). The eluted kinase domain was
further purified on a Mono Q column. The purified
kinase domain was analyzed by N-terminal sequencing and
mass spectrometry. Five amino acids (SAAGT) remained
from the histidine tag. The predicted molecular mass
was confirmed by mass spectrometry.

Crystal Growth 

Purified FGFR1 was concentrated to 20-50 mg/ml and
exchanged into 10 mM Tris-HCl (pH 8.0), 10 mM NaCl, and
2 mM DTT using a Centricon-30. Crystals were grown at
4°C by vapor diffusion in hanging drops containing 2.0
µl of 10 mg/ml protein solution and 2.0 µl of reservoir
solution: 16% polyethylene glycol (PEG) 10000, 0.3 M
(NH4)2SO4, 5% ethylene glycol, and 100 mM bis-Tris (pH
6.5).

Crystals of native FGFR1 were soaked in 500 ml
stabilizing solution [25% PEG 10000, 0.3 M (NH4)2SO4, 0.1
M Bis-Tris (pH 6.5), 5% ethylene glycol] containing
3-[(3-(2-carboxyethyl)-4-methylpyrrol-5-yl)methylene]-2-indolinone
(1-5 mM) or
3-[4-(4-formylpiperazine-1-yl)benzylidenyl]-2-indolinone
(1 mM) at 4°C for 24 to 48
hours. The final soaking concentration of DMSO was
between 1 and 5%. The crystals cracked at higher
concentrations of DMSO.

Co-crystals of FGFR1 with the inhibitors could also
be obtained by vapor diffusion in hanging drops
containing 2.0 µl of 10 mg/ml protein solution and 2.0
µl of reservoir solution containing 1 mM
3-[(3-(2-carboxyethyl)-4-methylpyrrol-5-yl)methylene]-2-indolinone
and
3-[4-(4-formylpiperazine-1-yl)benzylidenyl]-2-indolinone.

Co-crystals of FGFR1 complexed with AMP-PCP were
obtained as described for the creation of native
crystals, except that the protein solution additionally
contained 10 mM AMP-PCP and 20 mM MgCl2.



Preparation of Heavy Atom Derivative Crystals

Heavy atom derivative crystals were obtained by
soaking FGFR1 native crystals (C2-A form) in a solution
containing ethylmercurithiosalicylic acid (thimerosal),
KAu(CN)2 or 4-chloromercuribenzoic acid, as provided in
Table 1, infra, and containing 25% PEG 20000, 0.3 M
(NH4)2SO4, 5% ethylene glycol or glycerol, and 100 mM
bis-Tris (pH 6.5), and were flash-cooled either in
liquid nitrogen directly (synchrotron) or in a dry
nitrogen stream at -175°C (rotating anode).

Data Collection and Structure Determination

For native crystals and crystals comprising the
nucleotide analog AMP-PCP, data were collected either on
a Rigaku RU-200 rotating anode operated at 50 kV and 100
mA (Cu Kα) and equipped with double-focusing mirrors and
an R-AXIS IIC image plate detector, or at beamline X-4A
at the National Synchrotron Light Source, Brookhaven
National Laboratory. Synchrotron data (λ = 1.07 Å) were
collected on Fuji image plates and read with a Fuji
scanner. One cryo-cooled crystal was used for each of
the data sets. To obtain cryo-cooled crystals, crystals
were soaked in a cryo-protectant solution containing 25%
PEG 10000, 0.3 M (NH4)2SO4, 5% ethylene glycol or
glycerol and 100 mM bis-Tris (pH 6.5), and were
flash-cooled either in liquid nitrogen directly (synchrotron
data) or in a dry nitrogen stream at -175°C (rotating
anode data). All data were processed using DENZO and
SCALEPACK. Otwinowski, 1993, "Oscillation data
reduction program," Proceedings of the CCP4 Study
Weekend, Sawyer et al., eds. (Daresbury, United Kingdom:
SERC Daresbury Laboratory), 56-62.

For native crystals and crystals comprising the
nucleotide analog AMP-PCP, a molecular replacement
solution was found initially for the C2-B crystal form
using an IRK search model that consisted of polyalanine
with the common side chains for residues 993-1263 (FGFR1
residues 475-754), excluding residues 1094-1105 (kinase
insert) and 1153-1170 (activation loop). With AMORE
(Navaza, 1994, "AmoRe: an automated package for molecular
replacement," Acta Crystallogr. A50: 157-163), using 80%
of the structure factor amplitudes between 15.0 and 3.5
Å, one of the two molecules in the asymmetric unit was
located. The correlation coefficient (c.c.) for the
correct 1-molecule solution was 0.23 (versus 0.20 for
the highest incorrect solution). This molecule was
rigid body-refined in X-PLOR (Brunger, 1992, X-PLOR
(Version 3.1) Manual (New Haven, Connecticut: The Howard
Hughes Medical Institute and Department of Molecular
Biophysics and Biochemistry, Yale University)), first as
one rigid body unit, then as two units each comprising a
lobe of the kinase. Rigid body refinement (12.0-3.5 Å,
F > 3σ) resulted in a relative rotation of the two lobes
of ~10° and an increase of the c.c. from 0.20 to 0.25.
The rigid body-refined molecule was then used as a new
search model in AMORE, and this time both molecules in
the asymmetric unit were located. The c.c. for the
correct 2-molecule solution was 0.35 (versus 0.27 for
the highest incorrect solution).

Multiple cycles of model building and refinement
against 6.0-2.4 Å data resulted in the addition to the
model of many of the side chains and some of the missing
polypeptide chain. Model building was performed using
TOM/FRODO (Jones, 1985, "Diffraction methods for
biological macromolecules. Interactive computer
graphics: FRODO," Methods in Enzymology 115: 157-171)
and conjugate-gradient minimization and simulated
annealing were performed using X-PLOR. Brunger, supra.
At this stage, the R-value was 30% (free R-value of
36%). To help expedite model building and refinement,
experimental phases were obtained. Because crystals
grown in the presence of ethylene glycol were easier to
manipulate than those grown in glycerol, several
heavy-atom derivative data sets were collected from C2-A
crystals that had been soaked in various heavy atom
solutions. The C2-B structure was subsequently refined
against 6.0-2.4 Å data to an R-value of 23.8% (free
R-value of 30.4%) with r.m.s.d. values of 0.008 Å for bond
distances and 1.4° for bond angles.

Molecular replacement was used to locate the two
FGFR1 molecules (designated FLGK-A and FLGK-B) in the
asymmetric unit of the C2-A crystal form. Using AMORE
with 80% of structure factor amplitudes between 15.0 and
3.5 Å and the C2-B model, the c.c. for the correct
2-molecule solution was 0.62 (versus 0.35 for the highest
incorrect solution). Heavy atom positions were
determined from difference Fourier maps using the
calculated phases from the partial model. Refinement of
heavy atom parameters and phase determination were
performed with MLPHARE (Otwinowski, 1991, "Maximum
likelihood refinement of heavy atom parameters,"
Isomorphous Replacement and Anomalous Scattering, Evans
and Leslie eds. (Daresbury, United Kingdom: SERC
Daresbury Laboratory), 56-62). An initial multiple
isomorphous replacement (MIR)-phased electron density
map was calculated with data between 20.0 and 2.8 Å
resolution. This map was improved by solvent
flattening, histogram matching, and non-crystallographic
symmetry (NCS) averaging using DM (Cowtan, 1994,
"Protein Crystallography," CCP4 and ESF-EACBM Newsletter
(joint) 31: 34-38).
Refinement of the C2-A FGFR1 structure against
6.0-2.0 Å data proceeded by conjugate-gradient minimization
and simulated annealing using X-PLOR. Tight NCS
restraints were imposed until data to 2.0 Å resolution
were included in the refinement, at which point the
restraints were lifted. An overall anisotropic B-value
was calculated using X-PLOR and applied to the observed
structure factors, reducing the R-value by ~3%. Water
molecules whose B-values refined to >70 Å^2 were omitted
from the subsequent refinement round. The average
B-value is 37.5 Å^2 for all protein atoms, 35.4 Å^2 for
protein atoms in FLGK-A, 39.7 Å^2 for protein atoms in
FLGK-B, and 40.2 Å^2 for water molecules. The side chains
for Cys-603 in FLGK-A and FLGK-B and for Met-534 in
FLGK-B have been modeled in two different conformations.
Residues that are not included in the atomic model due
to poor supporting electron density are for FLGK-A:
456-463, 486-490, 501-504, 580-591, 763-765; and for FLGK-B:
456-460, 501-504, 578-593, 646-651, 657-659, 762-765.

The positions of the two AMP-PCP molecules (one per 

25 FGFR1 molecule) were easily identified in 2F ODI(co . COTplex r 
F NlelRmi difference Fourier maps. The AMP-PCP molecule 
bound to FLGK-B is less tightly bound and has been 
modeled with an occupancy of 0.5. 

Table 5 summarizes the X-ray crystallography data
sets of FGFR1 derivative crystals that were used to
determine the structures of crystalline FGFR1 and
crystalline FGFR1:AMP-PCP co-complex of the invention.

TABLE 5
Data Collection and MIR Phasing Summary

                        Native          AMP-PCP         Thi-1^a   Thi-2^a   PCMB^a             KAu(CN)2
X-ray source            X-4A            RU-200          RU-200    RU-200    RU-200             RU-200
Resolution limit (Å)    2.0             2.3             2.6       2.8       2.8                2.8
Number of sites         -               -               4         7         2                  2
Conc. (mM)/time (h)     -               -               0.1/24    0.1/48    0.2/2              5.0/72
R_sym^b (%)             4.8 (19.7)^c    4.5 (23.3)^c    5.5       9.8       6.8                6.8
Total observations      122569          91324           55456     59488     6798[illegible]    45303
Unique reflections      50771           31997           42820^d   35538^d   18619              18202
Completeness (%)        97.3 (96.3)^c   95.5 (93.7)^c   95.0      96.7      98.0               97.7
Signal (% I > 3σ)       80.7 (50.3)^c   79.6 (51.7)^c   69.8      66.8      84.7               77.6
R_iso^e (%)             -               -               17.1      31.2      15.4               15.2
Phasing power^f         -               -               1.8       2.0       1.0                0.9
R_Cullis^g (%)          -               -               0.55      0.50      0.81               0.84
Overall FOM^h           0.60

a Thi-1, Thi-2: ethylmercurithiosalicylic acid (thimerosal); PCMB:
  4-chloromercuribenzoic acid.
b $R_{sym} = 100 \times \sum_h \sum_i |I_i(h) - \langle I(h) \rangle| / \sum_h \sum_i I_i(h)$.
c Value in parentheses is for the highest resolution shell.
d I(+h) and I(-h) processed as independent reflections. Anomalous
  scattering contributions were included.
e $R_{iso} = 100 \times \sum_h ||F_{PH}(h)| - |F_P(h)|| / \sum_h |F_P(h)|$,
  where $F_P$ and $F_{PH}$ are the native and derivative structure
  factors, respectively.
f Phasing power: r.m.s. heavy atom structure factor / r.m.s. lack of
  closure (for acentric reflections from 20.0 to 2.8 Å).
g $R_{Cullis} = 100 \times \sum_h ||F_{PH}(h) \pm F_P(h)| - |F_{H(calc)}(h)|| / \sum_h |F_{PH}(h) \pm F_P(h)|$
  (for centric reflections from 20.0 to 2.8 Å).
h Figure of merit: $\int P(\phi) \exp(i\phi)\, d\phi / \int P(\phi)\, d\phi$,
  where P is the probability distribution of the phase angle $\phi$.
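The R_sym statistic defined in the footnotes to the data collection table above can be illustrated with a short Python sketch. It groups repeated intensity measurements by reflection index h, takes the mean of each group, and sums the absolute deviations. The function name `r_sym` and the sample observations are assumptions made for illustration only; they are not data from the crystals described here, and the actual processing was done with DENZO and SCALEPACK.

```python
from collections import defaultdict


def r_sym(observations):
    """Return R_sym in percent for an iterable of (hkl, intensity) pairs.

    R_sym = 100 * sum_h sum_i |I_i(h) - <I(h)>| / sum_h sum_i I_i(h)
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return 100.0 * numerator / denominator


# Hypothetical repeated measurements of two reflections (illustration only).
example_observations = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((0, 2, 1), 50.0),
    ((0, 2, 1), 40.0),
]
```

For the hypothetical observations above, the summed deviation is 20 and the summed intensity is 300, so `r_sym(example_observations)` evaluates to about 6.7%.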
For crystals comprising FGFR1 and compounds 1 and 2, data were
collected on a Rigaku RU-200 rotating anode (Cu Kα) operating at 50 kV
and 100 mA and equipped with double-focusing mirrors and an R-AXIS IIC
image plate detector. One cryo-cooled crystal was used for each of the
data sets. Crystals were soaked in a cryo-protectant [25% PEG 10000,
0.3 M (NH4)2SO4, 5% ethylene glycol, 100 mM bis-Tris (pH 6.5), and
1 mM 3-[(3-(2-carboxyethyl)-4-methylpyrrol-5-yl)methylene]-2-indolinone
(hereafter referred to as compound 1) or
3-[4-(4-formylpiperazine-1-yl)benzylidenyl]-2-indolinone (hereafter
referred to as compound 2)] and flash-cooled in a dry nitrogen stream
at -175°C. Data were processed using DENZO and SCALEPACK (Otwinowski,
1993, Proceedings of the CCP4 Study Weekend (Daresbury, United
Kingdom: SERC Daresbury Laboratory), pp. 56-62).

A summary of the data collection parameters is included in the
following Table 6:

TABLE 6

             Resolution   Observations   Completeness   Redundancy   R_sym (%)     Signal
             limit (Å)    (N)            (%)                                       (I/σI)
compound 1   2.5          93535          97.6 (96.1)    2.7          6.8 (23.0)    11.8
compound 2   2.4          94093          99.1 (97.9)    3.3          6.3 (32.2)    11.4

compound 1 structure: 550 residues, 252 water molecules, 2 compound 1 molecules (4589 atoms)
compound 2 structure: 550 residues, 248 water molecules, 2 compound 2 molecules (4646 atoms)



Structure Analyses

Atomic superpositions were performed with TOSS (Hendrickson, 1979).
Per residue solvent accessible surface calculations were done with
X-PLOR. The surface area buried in a dimer interface was calculated
with GRASP (Nicholls et al., 1991) using a probe radius of 1.4 Å. The
stereochemical quality of the atomic model was monitored using
PROCHECK (Laskowski et al., 1993, "PROCHECK: a computer program to
check the stereochemical quality of protein structures," J. Appl.
Cryst. 26: 283-291). As defined in PROCHECK, 93% of the residues in
the model have main-chain torsion angles in the most favored
Ramachandran regions. There are no residues in disallowed regions,
and three residues in generously allowed regions: Arg-622 in FLGK-A
and FLGK-B and Arg-554 in FLGK-A. The overall G-factor score is 0.42.

Table 7 summarizes the X-ray crystallography refinement parameters of
the structures of crystalline FGFR1 and the crystalline FGFR1:AMP-PCP
co-complex of the invention. Table 8 summarizes the X-ray
crystallography refinement parameters for the FGFR1/compound
complexes.

TABLE 7
Refinement Parameters

FGFR1: 550 residues, 252 water molecules (4589 atoms)
FGFR1:AMP-PCP: 550 residues, 238 water molecules, 2 AMP-PCP molecules (4638 atoms)

                                                                           R.m.s. deviations
Model           d-spacings   Reflections   R-value^a                   bonds   angles   B-values^b
                (Å)          (N)           (%)                         (Å)     (°)      (Å²)
FGFR1           6.0-2.0      42548         21.3 (26.2)^c               0.008   1.3      1.6
FGFR1:AMP-PCP   6.0-2.3      26729         20.[illegible] (27.5)^c     0.009   1.4      1.7

a $R\text{-value} = 100 \times \sum_h ||F_{obs}(h)| - |F_{calc}(h)|| / \sum_h |F_{obs}(h)|$
  for reflections with $F_{obs} > 2\sigma$.
b For bonded protein atoms.
c Value in parentheses is the free R-value (Brunger, 1993) determined
  from 5% of the data.
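The R-value and free R-value definitions in the footnotes above can be sketched in Python as follows. This is only a bookkeeping illustration under stated assumptions: the function names, the random selection of the 5% test set, and the synthetic amplitudes are hypothetical. In an actual refinement the test set is chosen before refinement and excluded from it, and the values reported here were obtained with X-PLOR, not with this code.

```python
import random


def r_value(f_obs, f_calc):
    """R-value (%) = 100 * sum ||Fobs| - |Fcalc|| / sum |Fobs|."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return 100.0 * numerator / denominator


def work_and_free_r(reflections, free_fraction=0.05, sigma_cutoff=2.0, seed=0):
    """Return (R_work, R_free) for a list of (f_obs, sigma, f_calc) tuples.

    Reflections with f_obs <= sigma_cutoff * sigma are discarded, and a
    random free_fraction of the remainder is treated as the test set.
    """
    kept = [(fo, fc) for fo, sigma, fc in reflections if fo > sigma_cutoff * sigma]
    rng = random.Random(seed)
    n_free = max(1, round(free_fraction * len(kept)))
    free_indices = set(rng.sample(range(len(kept)), n_free))
    work = [pair for i, pair in enumerate(kept) if i not in free_indices]
    free = [pair for i, pair in enumerate(kept) if i in free_indices]
    r_work = r_value(*zip(*work))
    r_free = r_value(*zip(*free))
    return r_work, r_free
```

A free R-value noticeably higher than the working R-value, as in the parenthesized entries of the table, is the expected pattern because the test reflections are not used to fit the model.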



TABLE 8

                                                                  R.m.s. deviations
Model        d-spacings   Reflections   R-value^c         bonds   angles   B-values^d
             (Å)          (N)           (%)               (Å)     (°)      (Å²)
compound 1   6.0-2.4      42548         19.7 (27.0)^e     0.008   1.3      1.6
compound 2   6.0-2.5      26729         20.0 (28.0)^e     0.008   1.4      1.7

a $R_{sym} = 100 \times \sum_h \sum_i |I_i(h) - \langle I(h) \rangle| / \sum_h \sum_i I_i(h)$.
b Value in parentheses is for the highest resolution shell.
c $R\text{-value} = 100 \times \sum_h ||F_o(h)| - |F_c(h)|| / \sum_h |F_o(h)|$,
  where $F_o$ and $F_c$ are the observed and calculated structure
  factors, respectively ($F_o > 2\sigma$).
d For bonded protein atoms.
e Value in parentheses is the free R-value determined from 5% of the
  data.



Atomic Structural Coordinates

Tables 1 and 2 provide the atomic structural coordinates of
unphosphorylated FGFR1 and unphosphorylated FGFR1:AMP-PCP co-complex,
respectively. In the Tables, coordinates for both of the FGFR1
molecules of the dimer comprising the asymmetric unit are provided.
The amino acid residue numbers coincide with those used in FIG. 3. In
the first FGFR1 molecule of the dimer the residue number is preceded
by a 1, i.e., residue number 464 of the first FGFR1 molecule of the
dimer is denoted by "1464". Tables 3 and 4 provide the atomic
structural coordinates of FGFR1 in complex with indolinone compounds
found to inhibit FGFR1 function.
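The residue-numbering convention described above (a leading 1 marks the first FGFR1 molecule of the dimer, so that "1464" denotes residue 464) can be decoded with a small Python helper. The function name and the handling of labels without the leading 1 are assumptions for illustration; the text states only the convention for the first molecule, so any other label is returned unchanged and marked as unspecified.

```python
def decode_residue_label(label):
    """Split a residue label from the coordinate tables into (molecule, residue).

    Labels from 1000 to 1999 follow the stated convention for the first
    FGFR1 molecule of the dimer. Any other label is passed through as-is,
    because the text does not spell out a convention for it.
    """
    if 1000 <= label < 2000:
        return ("first molecule", label - 1000)
    return ("unspecified", label)
```

For example, `decode_residue_label(1464)` returns `("first molecule", 464)`.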
The following abbreviations are used in the Tables:

"Atom Type" refers to the element whose coordinates are provided. The
first letter in the column defines the element.

"A.A." refers to amino acid.

"X, Y and Z" provide the Cartesian coordinates of the element.

"B" is a thermal factor that measures movement of the atom around its
atomic center.

"OCC" refers to occupancy, and represents the percentage of time the
atom type occupies the particular coordinate. OCC values range from 0
to 1, with 1 being 100%.

"PRT1" or "PRT2" relate to occupancy, with PRT1 designating the
coordinates of the atom when in the first conformation and PRT2
designating the coordinates of the atom when in the second or
alternate conformation.

Structural coordinates for FGFR1 may be modified by mathematical
manipulation. Such manipulations include, but are not limited to,
crystallographic permutations of the raw structure coordinates,
fractionalization of the raw structure coordinates, integer additions
or subtractions to sets of the raw structure coordinates, inversion
of the raw structure coordinates and any combination of the above.
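As one illustration of such a manipulation, the Python sketch below fractionalizes Cartesian coordinates for a monoclinic cell and applies an integer addition. It assumes the common orthogonalization convention with a along x, b along y, and c in the xz-plane; the default cell dimensions (a = 208.3 Å, b = 57.8 Å, c = 65.5 Å, β = 107.2°) are those recited for one crystal form in the claims. The function names are hypothetical, and the convention actually used for the coordinate tables is not stated in the text.

```python
import math

# Monoclinic cell dimensions recited in the claims (Å and degrees).
A, B, C, BETA_DEG = 208.3, 57.8, 65.5, 107.2


def to_fractional(x, y, z, a=A, b=B, c=C, beta_deg=BETA_DEG):
    """Convert Cartesian (Å) to fractional coordinates (assumed axis convention)."""
    beta = math.radians(beta_deg)
    w = z / (c * math.sin(beta))
    v = y / b
    u = (x - c * math.cos(beta) * w) / a
    return u, v, w


def to_cartesian(u, v, w, a=A, b=B, c=C, beta_deg=BETA_DEG):
    """Convert fractional coordinates back to Cartesian (Å)."""
    beta = math.radians(beta_deg)
    x = a * u + c * math.cos(beta) * w
    y = b * v
    z = c * math.sin(beta) * w
    return x, y, z


def shift_by_cells(u, v, w, nu=0, nv=0, nw=0):
    """Integer addition to fractional coordinates (translation by whole cells)."""
    return u + nu, v + nv, w + nw
```

Shifting a fractional coordinate by one cell along a and converting back moves the atom by exactly a along x, which leaves the described crystal structure unchanged.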

In addition, the structural coordinates can be slightly modified and
still render nearly identical three dimensional structures.
Therefore, a measure of a unique set of structural coordinates is the
root-mean-square deviation of the resulting structure. Structural
coordinates that render three dimensional structures that deviate
from one another by a root-mean-square deviation of less than 1.5 Å
may be viewed as identical.
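The root-mean-square deviation criterion can be sketched in Python as follows. The sketch assumes the two coordinate sets list the same atoms in the same order and have already been superimposed (the superpositions reported here were performed with TOSS; no fitting is done in this code). The function names and the sample coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two equally long lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def viewed_as_identical(coords_a, coords_b, threshold=1.5):
    """Apply the 'less than 1.5 Å' criterion stated in the text."""
    return rmsd(coords_a, coords_b) < threshold
```

Two sets that differ by a uniform 1.0 Å displacement have an RMSD of 1.0 Å and would be viewed as identical under this criterion, whereas a uniform 2.0 Å displacement would not.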

EXAMPLE 2: Computer-Based Design of Modulators of PTK Function

Potential modulators of PTK function were designed and identified by
operating the program Catalyst on the structure of
3-[(3-(2-carboxyethyl)-4-methylpyrrol-5-yl)methylene]-2-indolinone.
The chemical features constraining the search model include a
hydrogen bond donor, a hydrogen bond acceptor, and two hydrophobic
points of contact. Approximately 40 compounds were identified as
potential modulators of PTK function using this method.

The compounds identified by the method as potential modulators of PTK
function were commercially available. These compounds were then
tested for their ability to inhibit the FLK PTK in an enzyme linked
immunosorbent assay (ELISA). The method of performing this assay is
taught in WO 96/40116, entitled "Indolinone Compounds for the
Treatment of Disease," published on December 19, 1996, invented by
Tang et al., incorporated by reference herein in its entirety,
including all figures, drawings, and tables. Flk-1 specific
antibodies can be prepared using the following protocol:

1. Prepare a Tresyl-Activated Agarose/Flk-1-D column by incubating
   10 ml of Tresyl-Activated Agarose with 20 mg of purified
   GST-Flk-1-D fusion protein in 100 mM sodium bicarbonate
   (pH [illegible]) buffer overnight at 4°C.

2. Wash the column once with PBS.

3. Block the excess sites on the column with 2 M glycine for 2 hours
   at 4°C.

4. Wash the column with PBS.

5. Incubate the column with Rabbit anti-Flk-1D production bleed for
   2 hours at 4°C.

6. Wash the column with PBS.

7. Elute antiserum with 100 mM citric acid, pH 3.0, and neutralize
   the eluate immediately with 2 M Tris, pH 9.0.

8. Dialyze the eluate against PBS overnight at 4°C with 3 changes of
   buffer (sample to buffer ratio is 1:100).

9. Adjust the dialyzed antiserum to 5% glycerol and store at -80°C
   in small aliquots.



The Flk-1 ELISA can include a 2,2'-azino-bis(3-ethylbenzthiazoline-6-
sulfonic acid) (ABTS) solution, which can comprise 100 mM citric acid
(anhydrous), 250 mM Na2HPO4 (pH 4.0), and 0.5 mg/ml ABTS (Sigma
catalog no. A-1888). The solution is most appropriately stored in the
dark at 4°C until ready for use.

The FLK-1 specific antibodies can also be purchased from Santa Cruz
Biotechnology (Catalog No. SC-504).
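As a worked example of the arithmetic behind the ABTS solution described above, the following Python sketch converts the stated concentrations into masses for a chosen volume. The molar masses are assumptions: 192.12 g/mol for anhydrous citric acid and 141.96 g/mol for Na2HPO4 taken as the anhydrous salt (the text does not say which hydrate of the phosphate is meant). The function name is hypothetical, and pH adjustment is not modeled.

```python
# Assumed molar masses in g/mol (anhydrous forms).
CITRIC_ACID_ANHYDROUS = 192.12
NA2HPO4_ANHYDROUS = 141.96


def abts_solution_masses(volume_ml):
    """Return grams of each component needed for volume_ml of ABTS solution."""
    litres = volume_ml / 1000.0
    return {
        "citric acid (g)": 0.100 * CITRIC_ACID_ANHYDROUS * litres,  # 100 mM
        "Na2HPO4 (g)": 0.250 * NA2HPO4_ANHYDROUS * litres,          # 250 mM
        "ABTS (g)": 0.5 * volume_ml / 1000.0,                       # 0.5 mg/ml
    }
```

Under these assumptions, one liter would call for roughly 19.2 g of citric acid, 35.5 g of Na2HPO4, and 0.5 g of ABTS.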

Four of the forty compounds identified as potential modulators of PTK
function were potent modulators of FLK function. These molecules have
the following structures:

The modulators inhibit the FLK protein kinase with the following IC50
values:

TABLE 9

Compound   FLK kinase IC50 (µM)   FLK kinase IC50 (µM)   EGFR    IGF-1R
           compounds tested       compounds tested       (µM)    (µM)
           at 100 µM              at 20 µM
1          14.8                   14                     >100    >100
2          15.7                   10.6                   >100    >100
3          21.4                   16.6                   68      309
4          22.9                   16.4                   >100    >100

The invention illustratively described herein may be practiced in the
absence of any element or elements, limitation or limitations which
is not specifically disclosed herein. The terms and expressions which
have been employed are used as terms of description and not of
limitation, and there is no intention that in the use of such terms
and expressions of excluding any equivalents of the features shown
and described or portions thereof, but it is recognized that various
modifications are possible within the scope of the invention claimed.
Thus, it should be understood that although the present invention has
been specifically disclosed by preferred embodiments and optional
features, modification and variation of the concepts herein disclosed
may be resorted to by those skilled in the art, and that such
modifications and variations are considered to be within the scope of
this invention as defined by the appended claims.

Those references not previously incorporated herein by reference,
including both patent and non-patent references, are expressly
incorporated herein by reference for all purposes. Other embodiments
are within the following claims.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT: SUGEN, INCORPORATED
                    351 Galveston Drive
                    Redwood City, CA 94063

    (ii) TITLE OF INVENTION: CRYSTAL STRUCTURES OF A PROTEIN
                             TYROSINE KINASE

   (iii) NUMBER OF SEQUENCES:

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Lyon & Lyon
          (B) STREET: 633 West Fifth Street, Suite 4700
          (C) CITY: Los Angeles
          (D) STATE: California
          (E) COUNTRY: U.S.A.
          (F) ZIP: 90071-2066

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: 3.5" Diskette, 1.44 Mb storage
          (B) COMPUTER: IBM Compatible
          (C) OPERATING SYSTEM: IBM P.C. DOS 5.0
          (D) SOFTWARE: FastSEQ for Windows 2.0

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: To Be Assigned
          (B) FILING DATE: Herewith
          (C) CLASSIFICATION:

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER:
          (B) FILING DATE:

WO 98/07835 



PCT/US97/148S5 



  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Warburg, Richard
          (B) REGISTRATION NUMBER: 32,327
          (C) REFERENCE/DOCKET NUMBER: 227/088-PCT

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (213) 489-1600
          (B) TELEFAX: (213) 955-0440
          (C) TELEX: 67-3510

(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 310 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Leu Ala Gly Val Ser Glu Tyr Glu Leu Pro Glu Asp Pro Arg Trp
1               5                   10                  15

Glu Leu Pro Arg Asp Arg Leu Val Leu Gly Lys Pro Leu Gly Glu Gly
            20                  25                  30

Cys Phe Gly Gln Val Val Leu Ala Glu Ala Ile Gly Leu Asp Lys Asp
        35                  40                  45

Lys Pro Asn Arg Val Thr Lys Val Ala Val Lys Met Leu Lys Ser Asp
    50                  55                  60

Ala Thr Glu Lys Asp Leu Ser Asp Leu Ile Ser Glu Met Glu Met Met
65                  70                  75                  80

Lys Met Ile Gly Lys His Lys Asn Ile Ile Asn Leu Leu Gly Ala Cys
                85                  90                  95

Thr Gln Asp Gly Pro Leu Tyr Val Ile Val Glu Tyr Ala Ser Lys Gly
            100                 105                 110

Asn Leu Arg Glu Tyr Leu Gln Ala Arg Arg Pro Pro Gly Leu Glu Tyr
        115                 120                 125

Cys Tyr Asn Pro Ser His Asn Pro Glu Glu Gln Leu Ser Ser Lys Asp
    130                 135                 140

Leu Val Ser Cys Ala Tyr Gln Val Ala Arg Gly Met Glu Tyr Leu Ala
145                 150                 155                 160

Ser Lys Lys Cys Ile His Arg Asp Leu Ala Ala Arg Asn Val Leu Val
                165                 170                 175

Thr Glu Asp Asn Val Met Lys Ile Ala Asp Phe Gly Leu Ala Arg Asp
            180                 185                 190

Ile His His Ile Asp Tyr Tyr Lys Lys Thr Thr Asn Gly Arg Leu Pro
        195                 200                 205

Val Lys Trp Met Ala Pro Glu Ala Leu Phe Asp Arg Ile Tyr Thr His
    210                 215                 220

Gln Ser Asp Val Trp Ser Phe Gly Val Leu Leu Trp Glu Ile Phe Thr
225                 230                 235                 240

Leu Gly Gly Ser Pro Tyr Pro Gly Val Pro Val Glu Glu Leu Phe Lys
                245                 250                 255

Leu Leu Lys Glu Gly His Arg Met Asp Lys Pro Ser Asn Cys Thr Asn
            260                 265                 270

Glu Leu Tyr Met Met Met Arg Asp Cys Trp His Ala Val Pro Ser Gln
        275                 280                 285

Arg Pro Thr Phe Lys Gln Leu Val Glu Asp Leu Asp Arg Ile Val Ala
    290                 295                 300

Leu Thr Ser Asn Gln Glu
305                 310



(2) INFORMATION FOR SEQ ID NO: 2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 315 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Ser Ala Ala Gly Thr Met Val Ala Gly Val Ser Glu Tyr Glu Leu Pro
1               5                   10                  15

Glu Asp Pro Arg Trp Glu Leu Pro Arg Asp Arg Leu Val Leu Gly Lys
            20                  25                  30

Pro Leu Gly Glu Gly Ala Phe Gly Gln Val Val Leu Ala Glu Ala Ile
        35                  40                  45

Gly Leu Asp Lys Asp Lys Pro Asn Arg Val Thr Lys Val Ala Val Lys
    50                  55                  60

Met Leu Lys Ser Asp Ala Thr Glu Lys Asp Leu Ser Asp Leu Ile Ser
65                  70                  75                  80

Glu Met Glu Met Met Lys Met Ile Gly Lys His Lys Asn Ile Ile Asn
                85                  90                  95

Leu Leu Gly Ala Cys Thr Gln Asp Gly Pro Leu Tyr Val Ile Val Glu
            100                 105                 110

Tyr Ala Ser Lys Gly Asn Leu Arg Glu Tyr Leu Gln Ala Arg Arg Pro
        115                 120                 125

Pro Gly Leu Glu Tyr Ser Tyr Asn Pro Ser His Asn Pro Glu Glu Gln
    130                 135                 140

Leu Ser Ser Lys Asp Leu Val Ser Cys Ala Tyr Gln Val Ala Arg Gly
145                 150                 155                 160

Met Glu Tyr Leu Ala Ser Lys Lys Cys Ile His Arg Asp Leu Ala Ala
                165                 170                 175

Arg Asn Val Leu Val Thr Glu Asp Asn Val Met Lys Ile Ala Asp Phe
            180                 185                 190

Gly Leu Ala Arg Asp Ile His His Ile Asp Tyr Tyr Lys Lys Thr Thr
        195                 200                 205

Asn Gly Arg Leu Pro Val Lys Trp Met Ala Pro Glu Ala Leu Phe Asp
    210                 215                 220

Arg Ile Tyr Thr His Gln Ser Asp Val Trp Ser Phe Gly Val Leu Leu
225                 230                 235                 240

Trp Glu Ile Phe Thr Leu Gly Gly Ser Pro Tyr Pro Gly Val Pro Val
                245                 250                 255

Glu Glu Leu Phe Lys Leu Leu Lys Glu Gly His Arg Met Asp Lys Pro
            260                 265                 270

Ser Asn Cys Thr Asn Glu Leu Tyr Met Met Met Arg Asp Cys Trp His
        275                 280                 285

Ala Val Pro Ser Gln Arg Pro Thr Phe Lys Gln Leu Val Glu Asp Leu
    290                 295                 300

Asp Arg Ile Val Ala Leu Thr Ser Asn Gln Glu
305                 310                 315
(2) INFORMATION FOR SEQ ID NO: 3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 351 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein
   (iii) HYPOTHETICAL: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Arg Gly Ser His His His His His His Gly Met Ala Ser Met Thr
1               5                   10                  15

Gly Gly Gln Gln Met Gly Arg Asp Leu Tyr Asp Asp Asp Asp Lys Asp
            20                  25                  30

Pro Ser Ser Arg Ser Ala Ala Gly Thr Met Val Ala Gly Val Ser Glu
        35                  40                  45

Tyr Glu Leu Pro Glu Asp Pro Arg Trp Glu Leu Pro Arg Asp Arg Leu
    50                  55                  60

Val Leu Gly Lys Pro Leu Gly Glu Gly Ala Phe Gly Gln Val Val Leu
65                  70                  75                  80

Ala Glu Ala Ile Gly Leu Asp Lys Asp Lys Pro Asn Arg Val Thr Lys
                85                  90                  95

Val Ala Val Lys Met Leu Lys Ser Asp Ala Thr Glu Lys Asp Leu Ser
            100                 105                 110

Asp Leu Ile Ser Glu Met Glu Met Met Lys Met Ile Gly Lys His Lys
        115                 120                 125

Asn Ile Ile Asn Leu Leu Gly Ala Cys Thr Gln Asp Gly Pro Leu Tyr
    130                 135                 140

Val Ile Val Glu Tyr Ala Ser Lys Gly Asn Leu Arg Glu Tyr Leu Gln
145                 150                 155                 160

Ala Arg Arg Pro Pro Gly Leu Glu Tyr Ser Tyr Asn Pro Ser His Asn
                165                 170                 175

Pro Glu Glu Gln Leu Ser Ser Lys Asp Leu Val Ser Cys Ala Tyr Gln
            180                 185                 190

Val Ala Arg Gly Met Glu Tyr Leu Ala Ser Lys Lys Cys Ile His Arg
        195                 200                 205

Asp Leu Ala Ala Arg Asn Val Leu Val Thr Glu Asp Asn Val Met Lys
    210                 215                 220

Ile Ala Asp Phe Gly Leu Ala Arg Asp Ile His His Ile Asp Tyr Tyr
225                 230                 235                 240

Lys Lys Thr Thr Asn Gly Arg Leu Pro Val Lys Trp Met Ala Pro Glu
                245                 250                 255

Ala Leu Phe Asp Arg Ile Tyr Thr His Gln Ser Asp Val Trp Ser Phe
            260                 265                 270

Gly Val Leu Leu Trp Glu Ile Phe Thr Leu Gly Gly Ser Pro Tyr Pro
        275                 280                 285

Gly Val Pro Val Glu Glu Leu Phe Lys Leu Leu Lys Glu Gly His Arg
    290                 295                 300

Met Asp Lys Pro Ser Asn Cys Thr Asn Glu Leu Tyr Met Met Met Arg
305                 310                 315                 320

Asp Cys Trp His Ala Val Pro Ser Gln Arg Pro Thr Phe Lys Gln Leu
                325                 330                 335

Val Glu Asp Leu Asp Arg Ile Val Ala Leu Thr Ser Asn Gln Glu
            340                 345                 350



(2) INFORMATION FOR SEQ ID NO: 4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 933 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA to mRNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ATGCTAGCAG GGGTCTCTGA GTATGAGCTT CCCGAAGACC CTCGCTGGGA GCTGCCTCGG    60
GACAGACTGG TCTTAGGCAA ACCCCTGGGA GAGGGCTGCT TTGGGCAGGT GGTGTTGGCA   120
GAGGCTATCG GGCTGGACAA GGACAAACCC AACCGTGTGA CCAAAGTGGC TGTGAAGATG   180
TTGAAGTCGG ACGCAACAGA GAAAGACTTG TCAGACCTGA TCTCAGAAAT GGAGATGATG   240
AAGATGATCG GGAAGCATAA GAATATCATC AACCTGCTGG GGGCCTGCAC GCAGGATGGT   300
CCCTTGTATG TCATCGTGGA GTATGCCTCC AAGGGCAACC TGCGGGAGTA CCTGCAGGCC   360
CGGAGGCCCC CAGGGCTGGA ATACTGCTAC AACCCCAGCC ACAACCCAGA GGAGCAGCTC   420
TCCTCCAAGG ACCTGGTGTC CTGCGCCTAC CAGGTGGCCC GAGGCATGGA GTATCTGGCC   480
TCCAAGAAGT GCATACACCG AGACCTGGCA GCCAGGAATG TCCTGGTGAC AGAGGACAAT   540
GTGATGAAGA TAGCAGACTT TGGCCTCGCA CGGGACATTC ACCACATCGA CTACTATAAA   600
AAGACAACCA ACGGCCGACT GCCTGTGAAG TGGATGGCAC CCGAGGCATT ATTTGACCGG   660
ATCTACACCC ACCAGAGTGA TGTGTGGTCT TTCGGGGTGC TCCTGTGGGA GATCTTCACT   720
CTGGGCGGCT CCCCATACCC CGGTGTGCCT GTGGAGGAAC TTTTCAAGCT GCTGAAGGAG   780
GGTCACCGCA TGGACAAGCC CAGTAACTGC ACCAACGAGC TGTACATGAT GATGCGGGAC   840
TGCTGGCATG CAGTGCCCTC ACAGAGACCC ACCTTCAAGC AGCTGGTGGA AGACCTGGAC   900
CGCATCGTGG CCTTGACCTC CAACCAGGAG TAG                                933
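The relationship between the nucleotide sequence of SEQ ID NO: 4 and the amino acid sequence of SEQ ID NO: 1 can be checked with a short translation sketch in Python. The codon table is the standard genetic code; the function name is hypothetical, and only the first 60 nucleotides of SEQ ID NO: 4 from the listing above are used as input.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a coding-strand DNA string into one-letter amino acid codes.

    Whitespace is ignored; translation stops at the first stop codon.
    """
    sequence = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(sequence) - 2, 3):
        residue = CODON_TABLE[sequence[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 60 nucleotides of SEQ ID NO: 4 as listed above.
SEQ4_START = (
    "ATGCTAGCAG GGGTCTCTGA GTATGAGCTT "
    "CCCGAAGACC CTCGCTGGGA GCTGCCTCGG"
)
```

Translating `SEQ4_START` gives `MLAGVSEYELPEDPRWELPR`, which corresponds to the first twenty residues of SEQ ID NO: 1 (Met Leu Ala Gly Val Ser Glu Tyr Glu Leu Pro Glu Asp Pro Arg Trp Glu Leu Pro Arg).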



(2) INFORMATION FOR SEQ ID NO: 5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1056 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGCGGGGTT CTCATCATCA TCATCATCAT GGTATGGCTA GCATGACTGG TGGACAGCAA    60
ATGGGTCGGG ATCTGTACGA CGATGACGAT AAGGATCCGA GCTCGAGATC TGCAGCTGGT   120
ACCATGGTAG CAGGGGTCTC TGAGTATGAG CTTCCCGAAG ACCCTCGCTG GGAGCTGCCT   180
CGGGACAGAC TGGTCTTAGG CAAACCCCTG GGAGAGGGCG CCTTTGGGCA GGTGGTGTTG   240
GCAGAGGCTA TCGGGCTGGA CAAGGACAAA CCCAACCGTG TGACCAAAGT GGCTGTGAAG   300
ATGTTGAAGT CGGACGCAAC AGAGAAAGAC TTGTCAGACC TGATCTCAGA AATGGAGATG   360
ATGAAGATGA TCGGGAAGCA TAAGAATATC ATCAACCTGC TGGGGGCCTG CACGCAGGAT   420
GGTCCCTTGT ATGTCATCGT GGAGTATGCC TCCAAGGGCA ACCTGCGGGA GTACCTGCAG   480
GCCCGGAGGC CCCCAGGGCT GGAATACTCC TACAACCCCA GCCACAACCC AGAGGAGCAG   540
CTCTCCTCCA AGGACCTGGT GTCCTGCGCC TACCAGGTGG CCCGAGGCAT GGAGTATCTG   600
GCCTCCAAGA AGTGCATACA CCGAGACCTG GCAGCCAGGA ATGTCCTGGT GACAGAGGAC   660
AATGTGATGA AGATAGCAGA CTTTGGCCTC GCACGGGACA TTCACCACAT CGACTACTAT   720
AAAAAGACAA CCAACGGCCG ACTGCCTGTG AAGTGGATGG CACCCGAGGC ATTATTTGAC   780
CGGATCTACA CCCACCAGAG TGATGTGTGG TCTTTCGGGG TGCTCCTGTG GGAGATCTTC   840
ACTCTGGGCG GCTCCCCATA CCCCGGTGTG CCTGTGGAGG AACTTTTCAA GCTGCTGAAG   900
GAGGGTCACC GCATGGACAA GCCCAGTAAC TGCACCAACG AGCTGTACAT GATGATGCGG   960
GACTGCTGGC ATGCAGTGCC CTCACAGAGA CCCACCTTCA AGCAGCTGGT GGAAGACCTG  1020
GACCGCATCG TGGCCTTGAC CTCCAACCAG GAGTAG                            1056
CLAIMS

What is claimed is:

1. A crystalline form of a polypeptide corresponding to the catalytic
domain of a protein tyrosine kinase.

2. The crystalline form of claim 1, wherein said protein tyrosine
kinase is a receptor protein tyrosine kinase.

3. The crystalline form of claim 2, wherein said receptor protein
tyrosine kinase is selected from the group consisting of PDGF-R, FLK,
CCK4, MET, TRKA, AXL, TIE, EPH, RYK, DDR, ROS, RET, LTK, ROR1, and
MUSK.

4. The crystalline form of claim 1, wherein said protein tyrosine
kinase is a non-receptor protein tyrosine kinase.

5. The crystalline form of claim 4, wherein said non-receptor protein
tyrosine kinase is selected from a group consisting of SRC, BRK, BTK,
CSK, ABL, ZAP70, FES, FAK, JAK, and ACK.

6. The crystalline form of claim 1, comprising one or more heavy
metal atoms.

7. The crystalline form of claim 1, wherein said protein tyrosine
kinase is FGFR.

8. The crystalline form of claim 7, wherein said FGFR is FGFR1.

9. The crystalline form of claim 8, defined by atomic structural
coordinates set forth in Table 1.

10. The crystalline form of claim 7, comprising at least one
compound.

11. The crystalline form of claim 10, wherein said compound is a
nucleotide analog.

12. The crystalline form of claim 11, wherein said nucleotide analog
is AMP-PCP.

13. The crystalline form of claim 12, defined by atomic structural
coordinates set forth in Table 2.

14. The crystalline form of claim 10, wherein said compound is an
indolinone compound.

15. The crystalline form of claim 14, wherein said indolinone
compound has a structure set forth in formula I or II:
or a pharmaceutically acceptable salt, isomer, metabolite, ester,
amide, or prodrug thereof, wherein

(a) A1, A2, A3, and A4 are independently carbon or nitrogen;

(b) R1 is hydrogen or alkyl;

(c) R2 is oxygen in the case of an oxindolinone or sulfur in the case
of a thiolindolinone;

(d) R3 is hydrogen;

(e) R4, R5, R6, and R7 are optionally present and are either (i)
independently selected from the group consisting of hydrogen, alkyl,
alkoxy, aryl, aryloxy, alkaryl, alkaryloxy, halogen, trihalomethyl,
S(O)R, SO2NRR', SO3R, SR, NO2, NRR', OH, CN, C(O)R, OC(O)R, NHC(O)R,
(CH2)nCO2R, and CONRR' or (ii) any two adjacent R4, R5, R6, and R7
taken together form a fused ring with the aryl portion of the
oxindole-based portion of the indolinone;

(f) R2', R3', R4', R5', and R6' are each independently selected from
the group consisting of hydrogen, alkyl, alkoxy, aryl, aryloxy,
alkaryl, alkaryloxy, halogen, trihalomethyl, S(O)R, SO2NRR', SO3R,
SR, NO2, NRR', OH, CN, C(O)R, OC(O)R, NHC(O)R, (CH2)nCO2R, and
CONRR';

(g) n is 0, 1, 2, or 3;

(h) R is hydrogen, alkyl or aryl;

(i) R' is hydrogen, alkyl or aryl; and

(j) A is a five membered heteroaryl ring selected from the group
consisting of thiophene, pyrrole, pyrazole, imidazole,
1,2,3-triazole, 1,2,4-triazole, oxazole, isoxazole, thiazole,
isothiazole, furan, 1,2,3-oxadiazole, 1,2,4-oxadiazole,
1,2,5-oxadiazole, 1,3,4-oxadiazole, 1,2,3,4-oxatriazole,
1,2,3,5-oxatriazole, 1,2,3-thiadiazole, 1,2,4-thiadiazole,
1,2,5-thiadiazole, 1,3,4-thiadiazole, 1,2,3,4-thiatriazole,
1,2,3,5-thiatriazole, and tetrazole, optionally substituted at one or
more positions with alkyl, alkoxy, aryl, aryloxy, alkaryl,
alkaryloxy, halogen, trihalomethyl, S(O)R, SO2NRR', SO3R, SR, NO2,
NRR', OH, CN, C(O)R, OC(O)R, NHC(O)R, (CH2)nCO2R or CONRR'.
16. The crystalline form of claim 15, wherein said indolinone
compound is
3-[(3-(2-carboxyethyl)-4-methylpyrrol-5-yl)methylene]-2-indolinone.

17. The crystalline form of claim 15, wherein said indolinone
compound is
3-[4-(4-formylpiperazine-1-yl)benzylidenyl]-2-indolinone.

18. The crystalline form of claim 16, defined by the atomic
structural coordinates of Table 3.

19. The crystalline form of claim 17, defined by the atomic
structural coordinates of Table 4.

20. The crystalline form of claim 1, having monoclinic unit cells.

21. The crystalline form of claim 20, wherein said monoclinic unit
cells have dimensions of about a=208.3 Å, b=57.8 Å, c=65.5 Å and
β=107.2°.

22. The crystalline form of claim 20, wherein said monoclinic unit
cells have dimensions of about a=211.6 Å, b=51.3 Å, c=66.1 Å and
β=107.7°.

23. The crystalline form of claim 10, comprising one or more heavy
metal atoms.

24. A polypeptide corresponding to the catalytic domain of a protein
tyrosine kinase, containing at least about 20 amino acid residues
upstream of the first glycine in the conserved glycine-rich region of
the catalytic domain, and at least about 17 amino acid residues
downstream of the conserved arginine located at the C-terminal
boundary of the catalytic domain.

25. The polypeptide of claim 24, wherein said protein tyrosine kinase
is a receptor protein tyrosine kinase.

26. The polypeptide of claim 24, wherein said protein tyrosine kinase
is a non-receptor protein tyrosine kinase.

27. The polypeptide of claim 25, wherein said receptor tyrosine
kinase is selected from the group consisting of FGF-R, PDGF-R, KDR,
CCK4, MET, TRKA, AXL, TIE, EPH, RYK, DDR, ROS, RET, LTK, ROR1, and
MUSK.

28. The polypeptide of claim 26, wherein said non-receptor kinase is
selected from the group consisting of SRC, BRK, BTK, CSK, ABL, ZAP70,
FES, FAK, JAK, and ACK.

29. The polypeptide of claim 24 having the amino acid sequence shown
in SEQ ID NO: 4.

30. A method of using the polypeptide of claim 24 to form a crystal,
comprising the steps of:

(a) mixing a volume of polypeptide solution with a reservoir
solution; and

(b) incubating the mixture obtained in step (a) over the reservoir
solution in a closed container, under conditions suitable for
crystallization.

31. A method of obtaining an FGF receptor tyrosine kinase domain
polypeptide in crystalline form, comprising the steps of:

(a) mixing a volume of polypeptide solution with an equal volume of
reservoir solution, wherein said polypeptide solution comprises
1 mg/mL to 60 mg/mL FGF-type tyrosine kinase domain protein, 10 mM to
200 mM buffering agent, 0 mM to 20 mM dithiothreitol and has a pH of
about 5.5 to about 7.5, and wherein said reservoir solution comprises
10% to 30% (w/v) polyethylene glycol, 0.1 M to 0.5 M ammonium
sulfate, 0% to 20% (w/v) ethylene glycol or glycerol, 10 mM to 200 mM
buffering agent and has a pH of about 5.5 to about 7.5; and

(b) incubating the mixture obtained in step (a) over said reservoir
solution in a closed container at a temperature between 0°C and 25°C
until crystals form.

32. The method of claim 31, wherein said polypeptide solution
comprises about 10 mg/mL FGF receptor tyrosine kinase domain, about
10 mM sodium chloride, about 2 mM dithiothreitol, about 10 mM
Tris-HCl and has a pH of about 8; the reservoir buffer comprises
about 16% (w/v) polyethylene glycol (MW 10000), about 0.3 M ammonium
sulfate, about 5% ethylene glycol or glycerol, about 100 mM bis-Tris
and has a pH of about 6.5; and the temperature is about 4°C.

33. The method of claim 31, wherein said polypeptide solution
comprises a compound.

34. A cDNA encoding an FGF receptor tyrosine kinase domain protein,
wherein a coding strand of the cDNA has the nucleotide sequence of
SEQ ID NO: 5.

35. A method of determining three dimensional 
structures of protein tyrosine kinases with unknown 
structure comprising the step of applying structural 
atomic coordinates set forth in Table 1, Table 2, Table 
3 , or Table 4 . 

36. The method of claim 35, comprising the 
following steps: 

(a) aligning a first computer representation 
of an amino acid sequence of a protein tyrosine kinase 
of unknown structure with a second computer 
representation of a protein tyrosine kinase of known 
structure by matching homologous regions of amino acid 
sequences of said first computer representation and said 
second computer representation; 

(b) transferring computer representations of 
amino acid structures in said protein tyrosine kinase of 
known structure to computer representations of 
corresponding amino acid structures in said protein 
tyrosine kinase with unknown structure; and 

(c) determining a low energy conformation of 
the protein tyrosine kinase structure resulting from 
step (b) . 

37. The method of claim 35, comprising the 
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following steps: 

(a) aligning the positions of atoms in the 
unit cell by matching electron diffraction data from two 
crystals, and 

(b) determining a low energy conformation of 
the resulting protein tyrosine kinase structure. 

38. The method of claim 35, comprising the 
following steps: 

(a) determining the secondary structure of a 
protein tyrosine kinase structure using NMR data; and 

(b) simplifying the assignment of through- 
space interactions of amino acids. 

39. The method of any one of claims 35, 36, 37, or 
38, wherein said protein tyrosine kinase with or without 
known structure is a receptor protein tyrosine kinase. 

40. The method of claim 39, wherein said receptor 
protein tyrosine kinase with or without known structure 
is selected from the group consisting of FGF-R, PDGF-R,
FLK, CCK4, MET, TRKA, AXL, TIE, EPH, RYK, DDR, ROS, RET,
LTK, ROR1, and MUSK.

41. The method of any one of claims 35, 36, 37, or
38, wherein said protein tyrosine kinase with or without
known structure is a non-receptor protein tyrosine
kinase.

42. The method of claim 41, wherein said protein 
tyrosine kinase with or without known structure is 
selected from the group consisting of SRC, BRK, BTK,
CSK, ABL, ZAP70, FES, FAK, JAK, and ACK.



43. A method of identifying a potential modulator
of protein tyrosine kinase function by docking a
computer representation of a structure of a compound
with a computer representation of a structure of a
cavity formed by the active-site of a protein tyrosine
kinase, wherein said structure of said protein tyrosine
kinase is defined by atomic structural coordinates set
forth in Table 1, Table 2, Table 3, or Table 4.



44. The method of claim 43, comprising the 
following steps: 

(a) removing a computer representation of a
compound complexed with a protein tyrosine kinase and
docking a computer representation of a compound from a
computer data base with a computer representation of the
active-site of the protein tyrosine kinase;

(b) determining a conformation of the complex
resulting from step (a) with a favorable geometric fit
and favorable complementary interactions; and

(c) identifying compounds that best fit said
active-site as potential modulators of protein tyrosine
kinase function.
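
As an illustration only, the ranking implied by steps (b) and (c) can be sketched with a crude geometric fit score that rewards ligand atoms lying within contact distance of active-site atoms and penalizes steric clashes. The function names, the distance cutoffs, and the clash penalty are assumptions made for this sketch; the claims name no particular scoring function, and the favorable complementary interactions of step (b) are not modeled here.

```python
import math


def fit_score(site_atoms, ligand_atoms, clash=2.0, contact=4.5):
    """Score one docked pose from (x, y, z) coordinates in angstroms."""
    score = 0
    for lig in ligand_atoms:
        for site in site_atoms:
            d = math.dist(lig, site)
            if d < clash:
                score -= 10  # atoms overlap: heavily penalized
            elif d <= contact:
                score += 1   # atoms in contact: rewarded
    return score


def rank_compounds(site_atoms, candidates):
    """Return candidate names ordered from best to worst fit score."""
    scored = {name: fit_score(site_atoms, atoms)
              for name, atoms in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)
```

In this sketch `candidates` maps a compound name to the coordinates of a single pose; a fuller treatment would sample many conformations per compound and keep the best-scoring one.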



45. The method of claim 43, comprising the 
following steps: 

(a) modifying a computer representation of a
compound complexed with a protein tyrosine kinase by the
deletion of a chemical group or groups or by the
addition of a chemical group or groups;

(b) determining a conformation of the complex
resulting from step (a) with a favorable geometric fit
and favorable complementary interactions; and

(c) identifying compounds that best fit the
protein tyrosine kinase active-site as potential
modulators of protein tyrosine kinase function.

46. The method of claim 43, wherein said method 
comprises the following steps: 

(a) removing a computer representation of a 
compound complexed with a protein tyrosine kinase; and 

(b) searching a data base for data base 
compounds similar to said compounds using a compound 
searching computer program or replacing portions of said 
compound with similar chemical structures from a data 
base using a compound construction computer program. 

47. The method of any one of claims 43, 44, 45, or
46, wherein said protein tyrosine kinase is a receptor
protein tyrosine kinase.

48. The method of claim 47, wherein said receptor
protein tyrosine kinase is selected from the group
consisting of FGF-R, PDGF-R, FLK, CCK4, MET, TRKA, AXL,
TIE, EPH, RYK, DDR, ROS, RET, LTK, ROR1, and MUSK.



49. The method of any one of claims 43, 44, 45, or
46, wherein said protein tyrosine kinase is a non-
receptor protein tyrosine kinase.

50. The method of claim 49, wherein said protein
tyrosine kinase is selected from the group consisting of
SRC, BRK, BTK, CSK, ABL, ZAP70, FES, FAK, JAK, and ACK.

51. A potential modulator of protein tyrosine
kinase function identified by the method of any one of 
claims 43, 44, 45, or 46. 

52. The potential modulator of claim 51, wherein 
said modulator is selected from a computer data base. 

53. The potential modulator of claim 51, wherein 
said modulator is constructed from chemical groups 
selected from a computer data base. 

54. The potential modulator of protein tyrosine 
kinase function of claim 51, wherein said modulator is 
an indolinone compound of formula I or II: 
or a pharmaceutically acceptable salt, isomer,
metabolite, ester, amide, or prodrug thereof, wherein 

(a) A1, A2, A3, and A4 are independently carbon or
nitrogen;

(b) R1 is hydrogen or alkyl;

(c) R2 is oxygen in the case of an oxindolinone or
sulfur in the case of a thiolindolinone;

(d) R3 is hydrogen;

(e) R4, R5, R6, and R7 are optionally present and are
either (i) independently selected from the group
consisting of hydrogen, alkyl, alkoxy, aryl, aryloxy,
alkaryl, alkaryloxy, halogen, trihalomethyl, S(O)R,
SO2NRR', SO3R, SR, NO2, NRR', OH, CN, C(O)R, OC(O)R,
NHC(O)R, (CH2)nCO2R, and CONRR' or (ii) any two adjacent
R4, R5, R6, and R7 taken together form a fused ring with
the aryl portion of the oxindole-based portion of the
indolinone;

(f) R2', R3', R4', R5', and R6' are each
independently selected from the group consisting of
hydrogen, alkyl, alkoxy, aryl, aryloxy, alkaryl,
alkaryloxy, halogen, trihalomethyl, S(O)R, SO2NRR', SO3R,
SR, NO2, NRR', OH, CN, C(O)R, OC(O)R, NHC(O)R, (CH2)nCO2R,
and CONRR';

(g) n is 0, 1, 2, or 3; 

(h) R is hydrogen, alkyl or aryl; 

(i) R' is hydrogen, alkyl or aryl; and

(j) A is a five membered heteroaryl ring selected
from the group consisting of thiophene, pyrrole,
pyrazole, imidazole, 1,2,3-triazole, 1,2,4-triazole,
oxazole, isoxazole, thiazole, isothiazole, furan,
1,2,3-oxadiazole, 1,2,4-oxadiazole, 1,2,5-oxadiazole,
1,3,4-oxadiazole, 1,2,3,4-oxatriazole, 1,2,3,5-oxatriazole,
1,2,3-thiadiazole, 1,2,4-thiadiazole, 1,2,5-thiadiazole,
1,3,4-thiadiazole, 1,2,3,4-thiatriazole,
1,2,3,5-thiatriazole, and tetrazole, optionally
substituted at one or more positions with alkyl, alkoxy,
aryl, aryloxy, alkaryl, alkaryloxy, halogen,
trihalomethyl, S(O)R, SO2NRR', SO3R, SR, NO2, NRR', OH,
CN, C(O)R, OC(O)R, NHC(O)R, (CH2)nCO2R or CONRR'.

55. A method of identifying a potential modulator
of protein tyrosine kinase function as a modulator of
protein tyrosine kinase function, comprising the
following steps:

(a) administering said potential modulator to
cells;

(b) comparing the level of protein tyrosine
kinase phosphorylation between cells not administered
the potential modulator and cells administered said
potential modulator; and

(c) identifying said potential modulator as a
modulator of protein tyrosine kinase function based on
the difference in the level of protein tyrosine kinase
phosphorylation.
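
As an illustration only, the comparison in steps (b) and (c) can be sketched as a fold-change test on replicate phosphorylation measurements. The function name, the use of replicate means, and the two-fold threshold are assumptions made for this sketch; the claim states no threshold or statistical test.

```python
from statistics import mean


def classify_modulator(untreated, treated, min_fold_change=2.0):
    """Compare mean phosphorylation signals of treated and untreated cells."""
    base = mean(untreated)
    test = mean(treated)
    if base <= 0 or test <= 0:
        raise ValueError("phosphorylation signals must be positive")
    ratio = test / base
    if ratio >= min_fold_change:
        return "activator"   # phosphorylation rose by the threshold or more
    if ratio <= 1 / min_fold_change:
        return "inhibitor"   # phosphorylation fell by the threshold or more
    return "no effect detected"
```

For example, `classify_modulator([100, 110, 90], [20, 25, 15])` compares a mean of 100 with a mean of 20, a five-fold decrease, and therefore returns `"inhibitor"`.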

56. A method of identifying a potential modulator
of protein tyrosine kinase function as a modulator of
protein tyrosine kinase function, wherein said method
comprises the following steps:

(a) administering a preparation of said
potential modulator to cells;

(b) comparing the rate of cell growth between
cells not administered the modulator and cells
administered the modulator; and

(c) identifying said potential modulator as a
modulator of protein tyrosine kinase function based on
the difference in the rate of cell growth.

57. A method of treating a disease associated with
a protein tyrosine kinase with inappropriate activity in
a cellular organism, wherein said method comprises the
steps of:

(a) administering a modulator of protein
tyrosine kinase function to the organism, wherein said
modulator is in an acceptable pharmaceutical
preparation; and

(b) activating or inhibiting the protein
tyrosine kinase function to treat the disease.






58. The method of any one of claims 55, 56, or 57, 
wherein said protein tyrosine kinase is a receptor 
protein tyrosine kinase. 

59. The method of claim 58, wherein said receptor
protein tyrosine kinase is selected from the group
containing FGF-R, PDGF-R, FLK, CCK4, MET, TRKA, AXL,
TIE, EPH, RYK, DDR, ROS, RET, LTK, ROR1, and MUSK.


60. The method of any one of claims 55, 56, or 57, 
wherein said protein tyrosine kinase is a non-receptor 
protein tyrosine kinase. 

61. The method of claim 60, wherein said non-
receptor protein tyrosine kinase is selected from a
group consisting of SRC, BRK, BTK, CSK, ABL, ZAP70, FES,
FAK, JAK, and ACK.